X-Pipeline: An analysis package for autonomous gravitational-wave burst searches 
Autonomous gravitational-wave searches - fully automated analyses of data that run without 
human intervention or assistance - are desirable for a number of reasons. They are necessary for 
the rapid identification of gravitational-wave burst candidates, which in turn will allow for follow- 
up observations by other observatories and the maximum exploitation of their scientific potential. 
A fully automated analysis would also circumvent the traditional "by hand" setup and tuning of 
burst searches that is both laborious and time consuming. We demonstrate a fully automated 
search with X-Pipeline, a software package for the coherent analysis of data from networks of 
interferometers for detecting bursts associated with GRBs and other astrophysical triggers. We 
discuss the methods X-Pipeline uses for automated running, including background estimation, 
efficiency studies, unbiased optimal tuning of search thresholds, and prediction of upper limits. 
These are all done automatically via Monte Carlo with multiple independent data samples, and 
without requiring human intervention. As a demonstration of the power of this approach, we apply 
X-Pipeline to LIGO data to compute the sensitivity to gravitational-wave emission associated with 
GRB 031108. We find that X-Pipeline is sensitive to signals approximately a factor of 2 weaker in 
amplitude than those detectable by the cross-correlation technique used in LIGO searches to date. 
We conclude with the status of running X-Pipeline as a fully autonomous, near real-time triggered 
burst search in the current LSC-Virgo Science Run. 

PACS numbers: 04.80.Nn, 95.55.Ym, 07.05.Kf 



I. INTRODUCTION 

Gravitational-wave bursts (GWBs) are one of the most 
interesting classes of signals being sought by the new gen- 
eration of gravitational-wave detectors. Possible sources 
include core-collapse supernovae [1], the merger of bina- 
ries containing black holes or neutron stars [2], gamma- 
ray bursts [3], and other relativistic systems; see [4] for 
a brief overview. These systems typically involve matter 
at neutron-star densities and very strong gravitational 
fields, making GWBs potentially rich sources of informa- 
tion on relativistic astrophysics. 

The maximum exploitation of a GWB detection would 
occur when the system is observed by other "messengers" 
besides gravitational waves, such as optical light, gamma 
rays, or neutrinos [5]. Indeed, the first detection of a 
GWB might rely on independent confirmation by other 
observatories, and efforts are underway to develop col- 
laborations between gravitational-wave detectors, elec- 
tromagnetic telescopes, and neutrino observatories (see 
for example [6, 7]). The rapid and confident identifica- 
tion of candidate GWBs by gravitational-wave detectors 
will be vital for these efforts. 
Unfortunately, the analysis of gravitational-wave data 
tends to be a slow process, with a typical latency of 
several years between the collection of the data and 
the publication of results. For example, searches for 
gravitational-wave transients in the first year (2005- 
2006) of the LIGO Science Run 5 / Virgo Science Run 
1 (S5-VSR1) have only recently been published [8, 9]. 
One of the fastest such analyses has been the search for a 
gravitational-wave signal associated with GRB 070201 
[10], which was published 9 months after the event. 

The rapid analysis of gravitational-wave data is not 
trivial, particularly given the non-stationary nature of 
the background noise in gravitational-wave detectors and 
the lack of accurate and comprehensive waveform models 
for GWB signals. Specifically, we need methods capable 
of detecting weak signals with a priori unknown wave- 
forms, yet which are simultaneously insensitive to the 
background noise "glitches" that are common in data 
from gravitational-wave detectors. Glitch rejection is 
particularly important since it is the limiting factor in 
the sensitivity of current burst searches, and a confi- 
dent detection of a GWB will depend critically on ro- 
bust background estimation. Detector characterisation 
[11, 12] and search optimization tend to be laborious and 
time-consuming, as is accounting for other systematic ef- 
fects such as uncertainties in detector calibration. 

These considerations motivate the deployment of data 
analysis packages that can process data rapidly, yet com- 
prehensively. The ideal scenario is a fully autonomous 
search - one that runs continuously and without human 
intervention. This requires an analysis that is self-tuning, 
adjusting search parameters to changes in the detector 
network and accounting for variations in the properties 
of the background noise around the time of candidate 
events. 



We present X-Pipeline [13, 14], a software pack- 
age designed for autonomous searches for unmodelled 
gravitational-wave bursts. X-Pipeline targets GWBs 
associated with external astrophysical "triggers" such 
as gamma-ray bursts (GRBs), and has been used to 
search for GWBs associated with more than 100 GRBs 
that were observed during S5-VSR1 [15]. It performs a 
fully coherent analysis of data from arbitrary networks of 
gravitational-wave detectors, while being robust against 
noise-induced glitches. We emphasize the novel features 
of X-Pipeline, particularly a procedure for automated 
tuning of the background rejection tests. This allows 
the analysis of each external trigger to be optimized in- 
dependently, based on background noise characteristics 
and detector performance at the time of the trigger, max- 
imizing the search sensitivity and the chances of making 
a detection. The procedure uses independent data samples 
for tuning and for estimating the significance of candidate 
events, which gives an unbiased selection of GWB candidates. (See 
also [16] for a Bayesian-inspired technique for automated 
tuning.) X-Pipeline can also account automatically for 
effects like uncertainty in the sky position of the astrophys- 
ical trigger and detector calibration uncertainties. Fur- 
thermore, for the ongoing S6-VSR2 run, we are preparing 
the next step in the evolution of GWB searches: a fully 
autonomous search, wherein X-Pipeline is triggered au- 
tomatically by email reports of GRBs, and wherein data 
is analysed and candidate GWBs identified without hu- 
man intervention. Our goal is the complete analysis of 
each GRB within 24 hours of the receipt of the GRB 
notice. Such a rapid analysis would be fast enough to 
allow further follow-up observations to be prompted by 
the GWB candidate. 



We begin in Section II with a brief discussion of the 
theory of coherent analysis in gravitational-wave burst 
detection. In Section III we discuss the main steps 
followed in an X-Pipeline triggered coherent search. 
In Section IV we demonstrate the sensitivity of 
X-Pipeline on GRB 031108 using actual LIGO data, and 
compare to the upper limits set by the cross-correlation 
technique used in the published LIGO search for grav- 
itational waves associated with the same GRB. In Sec- 
tion V we discuss the status of autonomous running of 
X-Pipeline during the current S6-VSR2 science run of 
LIGO and Virgo. We conclude with a few brief comments 
in Section VI. 



II. COHERENT ANALYSIS FOR 
GRAVITATIONAL-WAVE BURST DETECTION 

Most algorithms currently used in gravitational-wave 
burst detection can be grouped into two broad classes. In 
incoherent methods [17, 18], candidate events typically 
are constructed from each detector data stream indepen- 
dently, and one looks for events with similar duration 
and frequency band that occur in all detectors simul- 
taneously. By contrast, coherent methods [14, 17, 
33] combine data from multiple detectors before process- 
ing, and create a single list of candidate events for the 
whole network. Coherent methods have some advantages 
over incoherent methods, such as demonstrated useful- 
ness in rejecting background noise "glitches" P^[^[55] . 
and for reconstructing GWB waveforms [El |32] . A less- 
recognized advantage of coherent methods is that they 
are relatively easy to tune. For example, time-frequency 
coincidence windows for comparing candidate GWBs in 
different detectors are not necessary. Detectors are natu- 
rally weighted by their relative sensitivity, so there is no 
need to tune the relative thresholds for generating candi- 
date events in each detector. This ease of tuning makes 
coherent methods particularly useful for rapid searches. 

That said, there are also drawbacks to coherent meth- 
ods, the most significant being computational cost. Co- 
herent combinations are typically a function of the sky 
position of the GWB source; there are ^ 10^ resolvable 
directions on the sky for a worldwide detector network 
[34]. This cost is compounded by the need to estimate 
the background due to noise, which requires repeated re- 
analysis of the data using time shifts. Fortunately, in 
triggered searches the sky position of the source is of- 
ten known to high accuracy, and the amount of data to 
be analysed is relatively small (typically hours), so the 
computational cost of a fully coherent analysis is modest. 
This allows triggered searches to take advantage of the 
benefits of coherent methods while avoiding or minimiz- 
ing most of the drawbacks. 

In this section we give a brief review of some of the 
main principles of coherent network analysis as imple- 
mented in X-Pipeline. 



A. Formulation 

A rigorous treatment of gravitational waves is based on 
linearized perturbations of the spacetime metric around 
a fixed background (see for example [35]). In the lin- 
earized theory based on flat spacetime, when working 
in a suitable gauge, the perturbations representing the 
gravitational waves can be shown to obey the ordinary 
wave equation. The gravitational waves are transverse, 
and travel at the speed of light. They have two inde- 
pendent polarizations, commonly referred to as "plus" 
(+) and "cross" (×). Their physical manifestation is 
a quadrupolar change in the distance between freely 
falling test particles (approximated in interferometric 
gravitational-wave detectors by the mirrors in the inter- 
ferometer arms). Explicit definitions of the plus and cross 
polarization states can be found, for example, in [17]. 

The interferometers currently used to try to detect 
these waves are based on a laser, beamsplitter, and mir- 
rors at the ends of each arm which serve as test masses. 
Data from each interferometer record the length differ- 
ence of the arms and, when calibrated, measure the strain 
induced by a gravitational wave. The LIGO detectors 
are kilometer-scale power-recycled Michelson interferom- 
eters with orthogonal Fabry-Perot arms [36, 37]. There 
are two LIGO observatories: one located at Hanford, 
WA and the other at Livingston, LA. The Hanford site 
houses two interferometers: one with 4 km arms (H1), 
and the other with 2 km arms (H2). The Livingston ob- 
servatory has one 4 km interferometer (L1). The Virgo 
detector (V1) is in Cascina near Pisa, Italy. It is a 3 
km long power-recycled Michelson interferometer with 
orthogonal Fabry-Perot arms [38]. The GEO 600 detec- 
tor [39], located near Hannover, Germany, is also oper- 
ational, though with a lower sensitivity than LIGO and 
Virgo. These instruments are all designed to detect grav- 
itational waves with frequencies ranging from ~30 Hz to 
several kHz. 

Consider a gravitational wave $h_+(t, \vec{x})$, $h_\times(t, \vec{x})$ from a 
direction $\hat{\Omega}$. The output of detector $\alpha \in [1, \ldots, D]$ is a 
linear combination of this signal and noise $n_\alpha$: 

$$
d_\alpha(t + \Delta t_\alpha(\hat{\Omega})) = F^+_\alpha(\hat{\Omega})\, h_+(t) + F^\times_\alpha(\hat{\Omega})\, h_\times(t) + n_\alpha(t + \Delta t_\alpha(\hat{\Omega})) . \tag{2.1}
$$

Here $F^+(\hat{\Omega})$, $F^\times(\hat{\Omega})$ are the antenna response functions 
describing the sensitivity of the detector to the plus and 
cross polarizations (note that the choice of polarization 
basis is arbitrary; we use the $\psi = 0$ choice of Appendix B 
of [17]). Also, $\Delta t_\alpha(\hat{\Omega})$ is the time delay between the 
position $\vec{r}_\alpha$ of detector $\alpha$ and an arbitrary reference position 
$\vec{r}_0$: 

$$
\Delta t_\alpha(\hat{\Omega}) = \frac{1}{c}\,(\vec{r}_0 - \vec{r}_\alpha)\cdot\hat{\Omega} . \tag{2.2}
$$

For brevity, we suppress explicit mention of the time de- 
lay and understand the data streams to be time-shifted 
by the appropriate amount prior to analysis. We also 
write $h_{+,\times}(t) \equiv h_{+,\times}(t, \vec{r}_0)$. 

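Since the detector data is sampled discretely, we 
use discrete notation henceforth. The discrete Fourier 
transform $\tilde{x}[k]$ of a time-series $x[j]$ is 

$$
\tilde{x}[k] = \sum_{j=0}^{N-1} x[j]\, e^{-i 2\pi jk/N} , \qquad
x[j] = \frac{1}{N} \sum_{k=0}^{N-1} \tilde{x}[k]\, e^{i 2\pi jk/N} , \tag{2.3}
$$

where $N$ is the number of data points in the time domain. 
Denoting the sampling rate by $f_s$, we can convert from 
continuous to discrete notation using $x(t) \to x[j]$, 
$\tilde{x}(f) \to f_s^{-1}\tilde{x}[k]$, $\int dt \to f_s^{-1}\sum_j$, 
$\int df \to f_s N^{-1}\sum_k$, $\delta(t - t') \to f_s\,\delta_{jj'}$, and 
$\delta(f - f') \to N f_s^{-1}\,\delta_{kk'}$. For example, the one-sided 
noise power spectral density $S_\alpha[k]$ of the noise $\tilde{n}_\alpha$ is 

$$
\langle \tilde{n}^*_\alpha[k]\, \tilde{n}_\beta[k'] \rangle = \frac{N}{2}\, \delta_{\alpha\beta}\, \delta_{kk'}\, S_\alpha[k] , \tag{2.4}
$$

where the angle brackets indicate an average over noise 
instantiations. 

It is conceptually convenient to define the noise- 
spectrum-weighted quantities 

$$
\tilde{d}_{w\alpha}[k] = \frac{\tilde{d}_\alpha[k]}{\sqrt{\frac{N}{2} S_\alpha[k]}} , \tag{2.5}
$$
$$
\tilde{n}_{w\alpha}[k] = \frac{\tilde{n}_\alpha[k]}{\sqrt{\frac{N}{2} S_\alpha[k]}} , \tag{2.6}
$$
$$
F^{+,\times}_{w\alpha}(\hat{\Omega}, k) = \frac{F^{+,\times}_\alpha(\hat{\Omega})}{\sqrt{\frac{N}{2} S_\alpha[k]}} . \tag{2.7}
$$

The normalization of the whitened noise is [51] 

$$
\langle \tilde{n}^*_{w\alpha}[k]\, \tilde{n}_{w\beta}[k'] \rangle = \delta_{\alpha\beta}\, \delta_{kk'} . \tag{2.8}
$$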
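As an illustration of the noise-spectrum weighting in equations (2.5)-(2.7), the following minimal Python sketch whitens one detector's Fourier-domain data and antenna response. The function name and the toy numbers are hypothetical and are not taken from X-Pipeline; the sketch assumes the normalization factor $\sqrt{(N/2)\,S_\alpha[k]}$ implied by equations (2.4) and (2.8).

```python
import math


def whiten(values, psd, n_samples):
    """Divide each frequency bin by sqrt((N/2) * S[k]) (assumed normalization)."""
    return [v / math.sqrt(0.5 * n_samples * s) for v, s in zip(values, psd)]


# Toy example with hypothetical numbers: N = 4 time samples, two frequency bins.
n_samples = 4
psd = [2.0, 8.0]          # one-sided noise PSD S[k] in each bin
d_tilde = [2 + 0j, 4j]    # Fourier-domain data for one detector
f_plus = [0.5, 0.5]       # antenna response F+, identical in both bins

d_white = whiten(d_tilde, psd, n_samples)
f_plus_white = whiten(f_plus, psd, n_samples)

print(d_white)
print(f_plus_white)
```

After whitening, the antenna response differs from bin to bin even though the unweighted response does not, which is why coherent combinations have to be recomputed for each frequency.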



With this notation, equation (2.1) for the data measured 
from a set of $D$ detectors can be written in the simple 
matrix form 

$$
\mathbf{d} = \mathbf{F}\,\mathbf{h} + \mathbf{n} , \tag{2.9}
$$

where we have dropped the explicit indices for frequency 
and sky position. We use the boldface symbols $\mathbf{d}$, $\mathbf{F}$, $\mathbf{n}$ 
to refer to noise-weighted quantities that are vectors or 
matrices on the space of detectors (note that $\mathbf{h}$ is not 
noise-weighted and is not in the space of the detectors): 

$$
\mathbf{d} \equiv \begin{bmatrix} \tilde{d}_{w1} \\ \tilde{d}_{w2} \\ \vdots \\ \tilde{d}_{wD} \end{bmatrix} , \qquad
\mathbf{h} \equiv \begin{bmatrix} \tilde{h}_+ \\ \tilde{h}_\times \end{bmatrix} , \qquad
\mathbf{n} \equiv \begin{bmatrix} \tilde{n}_{w1} \\ \tilde{n}_{w2} \\ \vdots \\ \tilde{n}_{wD} \end{bmatrix} , \tag{2.10}
$$

and 

$$
\mathbf{F} \equiv \begin{bmatrix} \mathbf{F}^+ & \mathbf{F}^\times \end{bmatrix} \equiv
\begin{bmatrix} F^+_{w1} & F^\times_{w1} \\ F^+_{w2} & F^\times_{w2} \\ \vdots & \vdots \\ F^+_{wD} & F^\times_{wD} \end{bmatrix} . \tag{2.11}
$$

(See Table I for a list of the dimensions of all of the 
quantities used in this section.) Note that each of these 
quantities is a function of both frequency and (through 
the antenna response or implied time shift) sky position. 
As a consequence, coherent combinations typically have 
to be re-computed for every frequency bin as well as 
for every sky position. Note also that, because of the 
noise-spectrum weighting, the whitened noise is isotropi- 
cally distributed in the space of detectors [equation (2.8)]. 
Therefore, all information on the sensitivity of the net- 
work both as a function of frequency and of sky position 
is contained in the matrix $\mathbf{F}$ defined by equation (2.11). 


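B. Standard Likelihood 

In this section we describe some of the simpler coher- 
ent likelihoods: those that can be computed from pro- 
jections of the data. These are the main ones used for 
signal detection in X-Pipeline. We begin with the sim- 
plest coherent likelihood of all: the standard or maximum 
likelihood, first derived in [17, 20]. 

Let $P(\mathbf{d}|\mathbf{h})$ be the probability of obtaining the 
whitened data $\mathbf{d}$ in one time-frequency pixel in the pres- 
ence of a known gravitational wave $\mathbf{h}$ from a known di- 
rection. Assuming Gaussian noise, 

$$
P(\mathbf{d}|\mathbf{h}) = \frac{1}{(2\pi)^{D/2}} \exp\left[ -\frac{1}{2} \left| \mathbf{d} - \mathbf{F}\mathbf{h} \right|^2 \right] . \tag{2.12}
$$

For a set $\{\mathbf{d}\}$ of $N_p$ time-frequency pixels 

$$
P(\{\mathbf{d}\}|\{\mathbf{h}\}) = \frac{1}{(2\pi)^{N_p D/2}} \exp\left[ -\frac{1}{2} \sum_k \left| \mathbf{d}[k] - \mathbf{F}[k]\mathbf{h}[k] \right|^2 \right] , \tag{2.13}
$$

where $k$ indexes the pixels. The likelihood ratio $L$ is 
defined by the log-ratio of this probability to the corre- 
sponding probability under the null hypothesis, 

$$
L \equiv \ln \frac{P(\{\mathbf{d}\}|\{\mathbf{h}\})}{P(\{\mathbf{d}\}|\{0\})} = \frac{1}{2} \sum_k \left[ \left|\mathbf{d}\right|^2 - \left|\mathbf{d} - \mathbf{F}\mathbf{h}\right|^2 \right] , \tag{2.14}
$$

where $P(\{\mathbf{d}\}|\{0\})$ is the probability of measuring the 
data $\{\mathbf{d}\}$ when no GWB is present ($\mathbf{h} = 0$). 

In practice, the signal waveform $\mathbf{h}$ is not known a pri- 
ori, so it is not clear how to compute the likelihood ra- 
tio (2.14). One approach is to treat the waveform values 
$\mathbf{h} = (h_+, h_\times)$ in each time-frequency pixel as free param- 
eters to be fit to the data. The best-fit values $\mathbf{h}_{\rm max}$ are 
those that maximize the likelihood ratio: 

$$
0 = \left. \frac{\partial L}{\partial \mathbf{h}} \right|_{\mathbf{h} = \mathbf{h}_{\rm max}} . \tag{2.15}
$$

Because the likelihood ratio $L$ is quadratic in $\mathbf{h}$, (2.15) 
gives a linear equation for $\mathbf{h}_{\rm max}$. The solution is 

$$
\mathbf{h}_{\rm max} = (\mathbf{F}^\dagger \mathbf{F})^{-1} \mathbf{F}^\dagger \mathbf{d} , \tag{2.16}
$$

where we use $\dagger$ to denote the conjugate transpose. ($\mathbf{F}$ is 
real, but other quantities such as the data vector $\mathbf{d}$ are 
complex.) Substituting the solution for $\mathbf{h}_{\rm max}$ in (2.14) 
gives the standard likelihood, 

$$
E_{\rm SL} \equiv 2L(\mathbf{h}_{\rm max}) = \sum_k \mathbf{d}^\dagger \mathbf{P}^{\rm GW} \mathbf{d} , \tag{2.17}
$$

where we define 

$$
\mathbf{P}^{\rm GW} \equiv \mathbf{F} (\mathbf{F}^\dagger \mathbf{F})^{-1} \mathbf{F}^\dagger \tag{2.18}
$$

and we have used the fact that $\mathbf{P}^{\rm GW}$ is Hermitian. (The 
factor of 2 in the definition of $E_{\rm SL}$ is purely a matter of 
taste.) 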
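The maximization in equations (2.15)-(2.17) can be sketched for a single time-frequency pixel as follows. This is a minimal pure-Python illustration, not X-Pipeline code: the function names, the three-detector antenna response vectors, and the data values are all hypothetical. It assumes real whitened responses, solves the 2 x 2 linear system for the best-fit waveform in closed form, and evaluates the standard likelihood as the energy of the fitted signal, which is equivalent to projecting the data onto the plane spanned by the two response vectors.

```python
def inner(a, b):
    """Hermitian inner product a^dagger b for two equal-length sequences."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def best_fit_waveform(f_plus, f_cross, d):
    """Solve (F^T F) h = F^T d for real whitened responses (eq. 2.16)."""
    a = inner(f_plus, f_plus)
    b = inner(f_plus, f_cross)
    c = inner(f_cross, f_cross)
    det = a * c - b * b           # non-zero unless the responses are parallel
    y_plus = inner(f_plus, d)
    y_cross = inner(f_cross, d)
    h_plus = (c * y_plus - b * y_cross) / det
    h_cross = (a * y_cross - b * y_plus) / det
    return h_plus, h_cross


def standard_likelihood(f_plus, f_cross, d):
    """E_SL for one pixel: the energy of the fitted signal F h_max (eq. 2.17)."""
    h_plus, h_cross = best_fit_waveform(f_plus, f_cross, d)
    fitted = [p * h_plus + x * h_cross for p, x in zip(f_plus, f_cross)]
    return inner(fitted, fitted).real


# Hypothetical three-detector example: signal h = (1+1j, 2) plus a small
# disturbance along (1, -1, 1), which is orthogonal to both response vectors.
f_plus = [1.0, 1.0, 0.0]
f_cross = [0.0, 1.0, 1.0]
d = [1.5 + 1j, 2.5 + 1j, 2.5 + 0j]

h_max = best_fit_waveform(f_plus, f_cross, d)
e_sl = standard_likelihood(f_plus, f_cross, d)
e_tot = inner(d, d).real

print(h_max)          # recovered waveform values
print(e_sl, e_tot)    # E_SL is smaller than the total energy
```

In this toy case the disturbance lies entirely outside the signal plane, so the fit recovers the injected waveform values and only the signal energy contributes to the standard likelihood.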



C. Projection Operators and the Null Energy 

It is easy to show that $\mathbf{P}^{\rm GW}$ is a projection operator 
that projects the data into the subspace spanned by $\mathbf{F}^+$ 
and $\mathbf{F}^\times$. We know by equation (2.1) or (2.9)-(2.11) that 
the contribution to $\mathbf{d}$ by any gravitational wave from a 
fixed sky position is restricted to this subspace. The 
standard likelihood is therefore the maximum amount of 
energy [52] in the whitened data that is consistent with 
the hypothesis of a gravitational wave from a given sky 
position. 

Contrast this with the total energy in the data, which 
is simply 

$$
E_{\rm tot} = \sum_k \left| \mathbf{d} \right|^2 . \tag{2.19}
$$

The total energy is an incoherent statistic in the sense 
that it contains only autocorrelation terms and no cross- 
correlation terms. In the limit of a one-detector network, 
this is the quantity one computes for each time-frequency 
pixel in an excess-power search [17]. 

The projection operator $\mathbf{P}^{\rm null} \equiv (\mathbf{I} - \mathbf{P}^{\rm GW})$, which is 
orthogonal to $\mathbf{P}^{\rm GW}$, cancels the gravitational-wave sig- 
nal. This yields the null stream with energy 

$$
E_{\rm null} \equiv E_{\rm tot} - E_{\rm SL} = \sum_k \mathbf{d}^\dagger \mathbf{P}^{\rm null} \mathbf{d} . \tag{2.20}
$$

The null energy is the minimum amount of energy in the 
whitened data that is inconsistent with the hypothesis of 
a gravitational wave from a given sky position. 

One advantage of coherent analysis is that the pro- 
jection from the full data space with energy $E_{\rm tot}$ to the 
subspace spanned by $\mathbf{F}^+$, $\mathbf{F}^\times$ with energy $E_{\rm SL}$ removes 
some fraction of the noise, with energy $E_{\rm null}$, without re- 
moving any of the signal component (small errors in cal- 
ibrations, sky position, or power spectra change $\mathbf{F}$ but 
this affects the signal energy only at second order). This 
means that a signal can be detected with higher confi- 
dence. An important caveat is that the full benefit is 
gained only if the sky position is known a priori, such 
as in gamma-ray burst searches. If the sky position of 
the source is not known a priori, one typically repeats 
the calculation of the likelihood for a set of directions 
spanning the entire sky (^ 10"^) directions). Since F+, 
F× vary with the sky position, this means that many 
different projection operators will be applied to the data. 
This will incur a false-alarm penalty. 
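The split of the total energy into a signal-plane part and a null part can be checked numerically. The following pure-Python sketch is an illustration under stated assumptions rather than X-Pipeline code: the function names and the three-detector numbers are hypothetical, the whitened responses are taken to be real and non-parallel, and the projection is implemented with a Gram-Schmidt basis of the signal plane instead of the matrix formula for the projection operator.

```python
import math


def inner(a, b):
    """Hermitian inner product a^dagger b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def signal_plane_basis(f_plus, f_cross):
    """Gram-Schmidt orthonormal basis of the plane spanned by F+ and Fx."""
    norm1 = math.sqrt(inner(f_plus, f_plus))
    e1 = [x / norm1 for x in f_plus]
    overlap = inner(e1, f_cross)
    residual = [y - overlap * x for x, y in zip(e1, f_cross)]
    norm2 = math.sqrt(inner(residual, residual))
    e2 = [r / norm2 for r in residual]
    return e1, e2


def pixel_energies(f_plus, f_cross, d):
    """Return (E_tot, E_SL, E_null) for a single time-frequency pixel."""
    e1, e2 = signal_plane_basis(f_plus, f_cross)
    e_tot = inner(d, d).real
    e_sl = abs(inner(e1, d)) ** 2 + abs(inner(e2, d)) ** 2
    return e_tot, e_sl, e_tot - e_sl


f_plus = [1.0, 1.0, 0.0]     # hypothetical whitened responses
f_cross = [0.0, 1.0, 1.0]

# Pure signal: d = F h with h = (1+1j, 2). The null energy should vanish.
signal_only = [1 + 1j, 3 + 1j, 2 + 0j]
# Same signal plus 0.5 * (1, -1, 1), which lies entirely in the null space.
with_null_part = [1.5 + 1j, 2.5 + 1j, 2.5 + 0j]

print(pixel_energies(f_plus, f_cross, signal_only))
print(pixel_energies(f_plus, f_cross, with_null_part))
```

For the signal-only vector the null energy is zero to rounding error, while the added null-space component changes the total and null energies without changing the energy in the signal plane.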
quantity                                    dimensions
h                                           2 × 1 vector
F                                           D × 2 matrix
P^GW, P^null, I                             D × D matrices
all other boldfaced symbols: d, n, etc.     D × 1 vectors

TABLE I: Dimensionality of various quantities used in this section. D is the number of detectors in the network. 



D. Dominant Polarization Frame and Other 
Likelihoods 

For a single time-frequency pixel, the data from a set 
of $D$ detectors is a vector in a $D$-dimensional complex 
space. One basis of this space is formed by the set of 
single-detector strains (the basis in which all equations 
have been written thus far); however, this is not the most 
convenient basis for writing detection statistics. The 2- 
dimensional subspace defined by $\mathbf{F}^+$, $\mathbf{F}^\times$ is a natural 
starting point for the construction of a better basis. If 
we examine the properties of this 2-dimensional space, we 
find there is a direction (a choice of polarization angle) in 
which the detector network has the maximum antenna re- 
sponse, and an orthogonal direction in which the network 
has minimum antenna response. Choosing those two di- 
rections as basis vectors, and completing them with an 
orthonormal basis for the null space, yields a very conve- 
nient basis in which to construct detection statistics. To 
further simplify things it is possible to define the $+$, $\times$ 
polarizations so that $\mathbf{F}^+$ lies along the first basis vector, 
and $\mathbf{F}^\times$ along the second. This choice of polarization def- 
inition is called the dominant polarization frame or DPF. 
Note that while searches for modeled signals 
such as binary inspirals often select the polarization basis 
with reference to the source, the DPF polarization basis 
is tailored to the detector network at each frequency. This 
makes it a particularly convenient choice when searching 
for more general gravitational-wave burst signals. 

To see how one constructs the DPF, recall that the 
antenna response vectors in two frames separated by a 
polarization angle $\psi$ are related by 

$$
\mathbf{F}^+(\psi) = \cos 2\psi\, \mathbf{F}^+(0) + \sin 2\psi\, \mathbf{F}^\times(0) , \tag{2.21}
$$
$$
\mathbf{F}^\times(\psi) = -\sin 2\psi\, \mathbf{F}^+(0) + \cos 2\psi\, \mathbf{F}^\times(0) \tag{2.22}
$$

(see for example equations (B9), (B10) of [17]). It is 
straightforward to show that for any direction on the 
sky, one can always choose a polarization frame such 
that $\mathbf{F}^+(\psi)$ and $\mathbf{F}^\times(\psi)$ are orthogonal and 
$|\mathbf{F}^+(\psi)| \ge |\mathbf{F}^\times(\psi)|$. Explicitly, given $\mathbf{F}^+(0)$, $\mathbf{F}^\times(0)$ in the origi- 
nal polarization frame, the rotation angle $\psi_{\rm DP}$ giving the 
dominant polarization frame is 

$$
\psi_{\rm DP}(\hat{\Omega}, k) = \frac{1}{4}\, {\rm atan2}\!\left( 2\,\mathbf{F}^+(0)\cdot\mathbf{F}^\times(0),\; |\mathbf{F}^+(0)|^2 - |\mathbf{F}^\times(0)|^2 \right) , \tag{2.23}
$$

where ${\rm atan2}(y, x)$ is the arctangent function with range 
$(-\pi, \pi]$. The factor of 1/4 follows from requiring 
$\mathbf{F}^+(\psi)\cdot\mathbf{F}^\times(\psi) = 0$ in (2.21)-(2.22), which gives 
$\tan 4\psi = 2\,\mathbf{F}^+(0)\cdot\mathbf{F}^\times(0) / (|\mathbf{F}^+(0)|^2 - |\mathbf{F}^\times(0)|^2)$. 
Note that $\psi_{\rm DP}$ is a function of both sky position 
and frequency (through the noise weighting of $\mathbf{F}^+$ and 
$\mathbf{F}^\times$). 

FIG. 1: Space of detector strains for the 3-detector case for 
one data sample. The green plane is the plane spanned by 
the antenna response vectors $\mathbf{F}^+$, $\mathbf{F}^\times$. The thick magenta 
line is the vector of detector strains $\mathbf{d}$ for one realization of 
noise and signal. The dashed lines show the projection of the 
data vector into the detector response plane and into the null 
space. 

We denote the antenna response vectors in the DPF by 
the lower-case symbols $\mathbf{f}^+$, $\mathbf{f}^\times$. They have the properties 

$$
|\mathbf{f}^+|^2 \ge |\mathbf{f}^\times|^2 , \tag{2.24}
$$
$$
\mathbf{f}^+ \cdot \mathbf{f}^\times = 0 . \tag{2.25}
$$

In the DPF the unit vectors $\mathbf{e}^+ \equiv \mathbf{f}^+/|\mathbf{f}^+|$, 
$\mathbf{e}^\times \equiv \mathbf{f}^\times/|\mathbf{f}^\times|$ are part of an orthonormal coordinate system; 
see Figure 1. Indeed, the DPF can be viewed as the nat- 
ural coordinate system in the space of detector data for 
understanding the sensitivity of the network. Mathemat- 
ically, rotating to the DPF is the same as doing a singular 
value decomposition of the matrix $\mathbf{F}$. The singular values 
are $|\mathbf{f}^+|$ and $|\mathbf{f}^\times|$; i.e., the magnitudes of the antenna 
response evaluated in the DPF. 

It should be noted that the DPF does not specify any 
particular choice of basis for the null space. Convenient 
choices for the null basis can be motivated by how the 
null energy is used in the search, but we do not consider 
this issue here. 
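The rotation into the dominant polarization frame can be sketched in a few lines of Python. The function name and the three-detector response vectors below are hypothetical; the sketch assumes real whitened responses at a single frequency and sky position, and uses the rotation of equations (2.21)-(2.22) with a rotation angle equal to one quarter of the atan2 of twice the dot product and the difference of the squared norms, which is the angle that makes the rotated vectors orthogonal.

```python
import math


def rotate_to_dpf(f_plus, f_cross):
    """Rotate real whitened responses F+(0), Fx(0) into the DPF."""
    cross_term = sum(p * x for p, x in zip(f_plus, f_cross))
    norm_diff = sum(p * p for p in f_plus) - sum(x * x for x in f_cross)
    psi = 0.25 * math.atan2(2.0 * cross_term, norm_diff)
    c, s = math.cos(2.0 * psi), math.sin(2.0 * psi)
    dpf_plus = [c * p + s * x for p, x in zip(f_plus, f_cross)]
    dpf_cross = [-s * p + c * x for p, x in zip(f_plus, f_cross)]
    return psi, dpf_plus, dpf_cross


# Hypothetical responses for a three-detector network.
psi_dp, f_p, f_x = rotate_to_dpf([1.0, 1.0, 0.0], [0.0, 1.0, 1.0])

overlap = sum(p * x for p, x in zip(f_p, f_x))
norm_p = sum(p * p for p in f_p)
norm_x = sum(x * x for x in f_x)

print(psi_dp)            # rotation angle in radians
print(overlap)           # should vanish: the DPF vectors are orthogonal
print(norm_p, norm_x)    # plus response is the larger of the two
```

For these toy vectors the squared norms in the DPF equal the eigenvalues of the 2 x 2 matrix of inner products of the original response vectors, consistent with the singular-value-decomposition interpretation.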

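In the DPF, the projection operator $\mathbf{P}^{\rm GW}$ takes on the 
very simple form 

$$
\mathbf{P}^{\rm GW} = \mathbf{e}^+ \mathbf{e}^{+\dagger} + \mathbf{e}^\times \mathbf{e}^{\times\dagger} . \tag{2.26}
$$

The standard likelihood (2.17) becomes 

$$
E_{\rm SL} = \sum_k \left[ \left| \mathbf{e}^+ \cdot \mathbf{d} \right|^2 + \left| \mathbf{e}^\times \cdot \mathbf{d} \right|^2 \right] , \tag{2.27}
$$

where we use the notation $\mathbf{a} \cdot \mathbf{b}$ to denote the familiar 
dot product between $D \times 1$ dimensional vectors $\mathbf{a}$ and $\mathbf{b}$. 
The plus energy or hard constraint likelihood [21, 22] is 
the energy in the $h_+$ polarization in the DPF: 

$$
E_+ \equiv \sum_k \left| \mathbf{e}^+ \cdot \mathbf{d} \right|^2 . \tag{2.28}
$$

The cross energy is defined analogously: 

$$
E_\times \equiv \sum_k \left| \mathbf{e}^\times \cdot \mathbf{d} \right|^2 . \tag{2.29}
$$

The soft constraint likelihood [21, 22] (not a projection 
likelihood) is 

$$
E_{\rm soft} \equiv \sum_k \left[ \left| \mathbf{e}^+ \cdot \mathbf{d} \right|^2 + \epsilon \left| \mathbf{e}^\times \cdot \mathbf{d} \right|^2 \right] , \tag{2.30}
$$

where the weighting factor $\epsilon$ is defined in the DPF as 

$$
\epsilon \equiv \frac{|\mathbf{f}^\times|^2}{|\mathbf{f}^+|^2} \in [0, 1] . \tag{2.31}
$$

Typical values are $\epsilon \sim 0.01$-$0.1$ for the LIGO network. 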
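The plus, cross, standard, and soft constraint energies can be evaluated together once the antenna responses are expressed in the DPF. The sketch below is a hedged illustration with hypothetical names and numbers; it assumes the responses are already orthogonal (as in the DPF), that the cross response is non-zero, and, for simplicity, that the same responses apply to every pixel, whereas in a real analysis they vary with frequency. The soft constraint energy is taken to be the plus energy plus the cross energy weighted by the ratio of the squared response norms.

```python
def inner(a, b):
    """Hermitian inner product a^dagger b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def dpf_energies(f_plus, f_cross, pixels):
    """Coherent energies summed over pixels, given DPF responses f+ and fx."""
    norm_plus = inner(f_plus, f_plus)      # |f+|^2
    norm_cross = inner(f_cross, f_cross)   # |fx|^2, assumed non-zero
    epsilon = norm_cross / norm_plus
    # |e . d|^2 = |f . d|^2 / |f|^2 for the unit vector e = f / |f|.
    e_plus = sum(abs(inner(f_plus, d)) ** 2 for d in pixels) / norm_plus
    e_cross = sum(abs(inner(f_cross, d)) ** 2 for d in pixels) / norm_cross
    return {
        "epsilon": epsilon,
        "plus": e_plus,
        "cross": e_cross,
        "standard": e_plus + e_cross,
        "soft": e_plus + epsilon * e_cross,
    }


# Hypothetical orthogonal DPF responses with |f+|^2 = 6 and |fx|^2 = 0.5.
f_plus = [1.0, 2.0, 1.0]
f_cross = [-0.5, 0.0, 0.5]

# Two pixels: one purely along f+, one purely along fx.
pixels = [
    [1 + 0j, 2 + 0j, 1 + 0j],
    [-1j, 0j, 1j],
]

energies = dpf_energies(f_plus, f_cross, pixels)
print(energies)
```

Because the weighting factor is small, the soft constraint energy stays close to the plus energy, which is the intended behaviour when the network is much less sensitive to the cross polarization.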

Numerous other likelihood-based coherent statistics 
have been introduced in the literature, such as the 
Tikhonov regularized statistic [23], a sky-map variability 
statistic [24], and modified constraint likelihood statistics 
[25]. Also, comprehensive Bayesian formulations of the 
problem of GWB detection and waveform estimation are 
described in H?! [32] . While some of these statistics 
are available in X-Pipeline, we do not consider them 
here. 



E. Statistical Properties 

One convenient property of the projection likelihoods 
$E_+$, $E_\times$, $E_{\rm SL}$, $E_{\rm null}$, $E_{\rm tot}$ is that their statistical prop- 
erties for signals in Gaussian background noise are very 
simple. Specifically, for a set of time-frequency pixels 
and a sky position chosen a priori, each of these ener- 
gies follows a non-central $\chi^2$ distribution with $2 N_p D_{\rm proj}$ degrees of 
freedom: 

$$
2E \sim \chi^2_{2 N_p D_{\rm proj}}(\lambda) . \tag{2.32}
$$

Here $N_p$ is the number of pixels (or time-frequency vol- 
ume), $D_{\rm proj}$ is the number of dimensions of the projec- 
tion, which is 1 for $E_+$, $E_\times$, 2 for $E_{\rm SL}$, and $D$ for $E_{\rm tot}$. 
$D_{\rm proj} = D - 2$ for $E_{\rm null}$, except when the null stream is 
constructed as the difference of the data streams from the 
two co-aligned LIGO-Hanford detectors, H1 and H2, in 
which case it is $D - 1$ (the H1-H2 sub-network is only sen- 
sitive to a single gravitational-wave polarization, so only 
one dimension is removed in forming the null stream). 
The factor of 2 in the degrees of freedom occurs because 
the data are complex. The non-centrality parameter $\lambda$ is 
the expected squared signal-to-noise ratio of a matched 
filter for the waveform restricted to the time-frequency 
region in question [53] and after projection by the appro- 
priate likelihood projection operator, summed over the 
network: 

$$
\lambda_+ = 2 \sum_k |\mathbf{f}^+|^2 |h_+|^2 = \frac{4}{N} \sum_\alpha \sum_k \frac{\left| F^+_\alpha(\psi_{\rm DP})\, h_+[k](\psi_{\rm DP}) \right|^2}{S_\alpha[k]} \equiv \rho_+^2 , \tag{2.33}
$$
$$
\lambda_\times = 2 \sum_k |\mathbf{f}^\times|^2 |h_\times|^2 = \frac{4}{N} \sum_\alpha \sum_k \frac{\left| F^\times_\alpha(\psi_{\rm DP})\, h_\times[k](\psi_{\rm DP}) \right|^2}{S_\alpha[k]} \equiv \rho_\times^2 , \tag{2.34}
$$
$$
\lambda_{\rm SL} = 2 \sum_k \left[ |\mathbf{f}^+|^2 |h_+|^2 + |\mathbf{f}^\times|^2 |h_\times|^2 \right] = \frac{4}{N} \sum_\alpha \sum_k \frac{\left| F^+_\alpha h_+[k] + F^\times_\alpha h_\times[k] \right|^2}{S_\alpha[k]} = \rho_+^2 + \rho_\times^2 , \tag{2.35}
$$
$$
\lambda_{\rm null} = 0 , \tag{2.36}
$$
$$
\lambda_{\rm tot} = \rho_+^2 + \rho_\times^2 . \tag{2.37}
$$

Note that in (2.33) and (2.34) the antenna responses and 
waveforms are defined in the DPF. Eqn. (2.35) is actually 
independent of the polarization basis used. 

The mean and standard deviation of the non-central $\chi^2$ 
distribution (2.32) are $(2 N_p D_{\rm proj} + \lambda)$ and $2\sqrt{N_p D_{\rm proj} + \lambda}$. 
Consequently, one expects a signal to be detectable by a 
given coherent statistic when its non-centrality parameter 
exceeds the noise-only standard deviation, 

$$
\frac{\lambda}{2\sqrt{N_p D_{\rm proj}}} > 1 . \tag{2.38}
$$

Table II shows the mean and standard deviation of var- 
ious energy measures when the correct sky position and 
the time-frequency region are known a priori. 
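The statistical properties above can be checked with a small Monte Carlo experiment. The sketch below is illustrative only: the function name, the pixel count, the signal amplitude, and the random seed are hypothetical choices. It assumes whitened complex Gaussian noise with unit mean-square magnitude (variance 1/2 in each quadrature) and places the same real signal amplitude in every projected component. The comparison values use the standard results for a non-central chi-squared variable with $n$ degrees of freedom and non-centrality $\lambda$: mean $n + \lambda$ and standard deviation $\sqrt{2(n + 2\lambda)}$.

```python
import math
import random


def simulate_2e(n_pixels, d_proj, amplitude, rng):
    """Draw one realization of 2E for a projection onto d_proj directions."""
    total = 0.0
    for _ in range(n_pixels * d_proj):
        # Whitened complex noise with <|n|^2> = 1: variance 1/2 per quadrature.
        re = amplitude + rng.gauss(0.0, math.sqrt(0.5))
        im = rng.gauss(0.0, math.sqrt(0.5))
        total += re * re + im * im
    return 2.0 * total


rng = random.Random(0)                  # fixed seed for reproducibility
n_pixels, d_proj, amplitude = 4, 1, 1.0
n_trials = 20000

dof = 2 * n_pixels * d_proj
lam = 2.0 * n_pixels * d_proj * amplitude ** 2   # non-centrality parameter

samples = [simulate_2e(n_pixels, d_proj, amplitude, rng) for _ in range(n_trials)]
mean = sum(samples) / n_trials
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n_trials - 1))

expected_mean = dof + lam
expected_std = math.sqrt(2.0 * (dof + 2.0 * lam))

print(mean, expected_mean)
print(std, expected_std)
```

The sample mean and standard deviation should agree with the analytic values to within the Monte Carlo uncertainty, which is a few hundredths for this number of trials.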

For a circularly polarized or unpolarized gravitational 
wave, $\rho_\times^2 / \rho_+^2 \simeq \epsilon \ll 1$ for typical sky positions. For 
example, for the LIGO-Virgo network of detectors H1- 
H2-L1-V1, assuming H2 is half as sensitive as H1, L1, 
and V1, the median value of $\epsilon$ is 0.1, while for the LIGO 
network H1-H2-L1 the median is 0.02. As a consequence, 
for many signals $\rho_\times^2$ is negligible. (An exception is lin- 
early polarized GWBs; for these the random polarization 
angle can make $\rho_\times^2 > \rho_+^2$ in the H1-H2-L1 network for 
approximately 10% of signals for a typical sky position.) 
Since all of the energies except $E_\times$ in Table II include $\rho_+^2$, 
their relative performance is dominated by the level of 
noise fluctuations. The noise fluctuations in the energies 
scale as the square-root of the number of orthogonal di- 
rections used to compute the energy. As a consequence, 
we expect those statistics that project the data down 
to fewer dimensions to perform better for GWB detec- 
tion. For $E_+$ the data is projected onto a single direction. 
$E_{\rm SL}$ and $E_{\rm soft}$ use data along two directions, and so have 
higher noise. The total energy $E_{\rm tot}$ uses all of the data 
and therefore incorporates the largest contributions from 
noise. In practice, coherent consistency tests (discussed 
in the next section) can be used to reduce the noise back- 
ground, allowing statistics like $E_{\rm SL}$ to be used effectively, 
so that all of the signal-to-noise ratio of a GWB ($\rho_+^2$ and 
$\rho_\times^2$) can be included in the detection statistic. 



F. Incoherent Energies and Background Rejection 

The various likelihood measures $E_{\rm SL}$, $E_+$, etc. are mo- 
tivated as detection statistics under the assumption of 
stationary Gaussian background noise. Real detectors 
do not have purely Gaussian noise. Rather, real detec- 
tor noise contains glitches, which are short transients of 
excess strain that can masquerade as gravitational-wave 
burst signals. In practice, without a means to distin- 
guish noise glitches from true GW signals, the sensitivity 
of a burst search will be limited by such glitches. Co- 
herent analyses can be particularly susceptible to such 
false alarms, since even a glitch in a single detector will 
produce large values for likelihoods such as $E_{\rm SL}$. In this 
section we outline a technique for the effective suppres- 
sion of such false alarms in coherent analyses. 

As shown in Chatterji et al. [14], one can use the auto- 
correlation component of coherent energies to construct 
tests that are effective at rejecting glitches. This coher- 
ent veto test is based on the null space - the subspace 
orthogonal to that used to define the standard likelihood. 
The projection of $\mathbf{d}$ on this subspace contains only noise, 
and the presence or absence of GWs should not affect 
this projection in any way. By contrast, glitches do not 
couple into the data streams with any particular relation- 
ship to $\mathbf{F}^+$, $\mathbf{F}^\times$. As a result, glitches will generally be 
present in the null space projection. This provides a way 
to distinguish true GWs from glitches, by requiring the 
null energy to be small for a transient to be considered a 
GW [28]. 

To see how an effective test can be constructed, note 
that we can write equation (2.20) for the null energy as 

$$
E_{\rm null} = \sum_k \sum_{\alpha,\beta} d^\dagger_\alpha\, P^{\rm null}_{\alpha\beta}\, d_\beta . \tag{2.39}
$$

As pointed out in Chatterji et al. [14], the null energy 
is composed of cross-correlation terms $d^\dagger_\alpha d_\beta$ and auto- 
correlation terms $d^\dagger_\alpha d_\alpha$. If the transient signal is not cor- 
related between detectors (as is expected for glitches), 
then the cross-correlation terms will be small compared 
to the auto-correlation terms. As a consequence, for a 
glitch we expect the null energy to be dominated by the 
auto-correlation components: 

$$
E_{\rm null} \simeq I_{\rm null} \equiv \sum_k \sum_\alpha P^{\rm null}_{\alpha\alpha} \left| d_\alpha \right|^2 \qquad \text{(glitches)} . \tag{2.40}
$$

This auto-correlation part of the null energy is called the 
incoherent energy. 

By contrast, for a GW signal, the transient is corre- 
lated between the detectors according to equations (2.1) 
or (2.9)-(2.11). By construction of the null projection 
operator, these correlations cancel in the null stream, 
leaving only Gaussian noise. They cannot cancel in 
$I_{\rm null}$, however, since that is a purely incoherent statis- 
tic. Therefore, for a strong GW signal we expect 

$$
E_{\rm null} \ll I_{\rm null} \qquad \text{(GW)} . \tag{2.41}
$$

Based on these considerations, the coherent veto test in- 
troduced by Chatterji et al. [14] is to keep only transients 
with 

$$
I_{\rm null} / E_{\rm null} > C , \tag{2.42}
$$

where $C$ is some constant greater than 1. This test is par- 
ticularly effective at eliminating large-amplitude glitches. 
For smaller amplitude glitches $E_{\rm null}$ can be small com- 
pared to $I_{\rm null}$ due to statistical fluctuations; for this rea- 
son in X-Pipeline we use a modified test where the effec- 
tive threshold $C$ varies with the event energy, as discussed 
in Section III D. 

Analogous tests can be imposed on the other coher- 
ent energies, $E_+$, $E_\times$, etc. We define the corresponding 
incoherent energies by 

$$
I_+ \equiv \sum_k \sum_\alpha \left| e^+_\alpha d_\alpha \right|^2 , \tag{2.43}
$$
$$
I_\times \equiv \sum_k \sum_\alpha \left| e^\times_\alpha d_\alpha \right|^2 , \tag{2.44}
$$
$$
I_{\rm SL} \equiv I_+ + I_\times . \tag{2.45}
$$

In each case, we compare the coherent energy $E$ to its 
incoherent counterpart $I$, making use of the expectation 
that for a glitch, $E \simeq I$. For a strong GW, the signal 
summed over both polarizations should build coherently, 
so one will find 

$$
E_{\rm SL} > I_{\rm SL} \qquad \text{(GW)} . \tag{2.46}
$$

By contrast, one may find $E_+ > I_+$ or $E_+ < I_+$ depend- 
ing on the polarization of the GW signal. Specifically, if 
the GW signal is predominantly in the $+$ polarization in 
the DPF, then one will find 

$$
E_+ > I_+ , \quad E_\times < I_\times \qquad \text{(signal predominantly } h_+ \text{)} . \tag{2.47}
$$

If the GW signal is predominantly in the $\times$ polarization 
in the DPF, then one will find the reverse: 

$$
E_+ < I_+ , \quad E_\times > I_\times \qquad \text{(signal predominantly } h_\times \text{)} . \tag{2.48}
$$
TABLE II: Expected signal and noise contributions to various coherent
energies. The mean($2E$) column shows the contribution to the mean
energy due to a gravitational wave, evaluated in the dominant
polarization frame. See equations (2.31), (2.33)-(2.36). The std($2E$)
column shows the standard deviation due to noise fluctuations assuming a
non-aligned detector network (i.e., $|f^\times| > 0$). The values for
$E_{\mathrm{soft}}$ are written as approximate because the weighting
factor $\epsilon$ is itself a function of frequency. All of these values
assume the time-frequency region to sum over and the correct sky
location are known a priori.



In general, a GW will be characterised by at least one of $E_+ > I_+$ or
$E_\times > I_\times$; i.e., at least one of the polarizations will show
a coherent buildup of signal-to-noise across detectors. This allows us
to impose coherent glitch rejection tests even in the case where a null
stream is not available, such as the H1-L1 network of LIGO detectors.
Specific examples of coherent consistency tests are discussed in
Sections III D and IV.

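These incoherent energies are not defined as the magnitude of a
projection. As a result, they do not obey $\chi^2$ statistics. They do,
however, obey a simple relation with the coherent energies:

$$I_+ + I_\times + I_{\mathrm{null}} = E_+ + E_\times + E_{\mathrm{null}} . \tag{2.49}$$

Equivalently, the cross-correlation contributions to $E_+$, $E_\times$,
and $E_{\mathrm{null}}$ sum to zero:

$$0 = (E_+ - I_+) + (E_\times - I_\times) + (E_{\mathrm{null}} - I_{\mathrm{null}})
    = (E_{\mathrm{SL}} - I_{\mathrm{SL}}) + (E_{\mathrm{null}} - I_{\mathrm{null}}) . \tag{2.50}$$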
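The relation between coherent and incoherent energies can be checked numerically for a single time-frequency pixel. The sketch below is illustrative only and is not X-Pipeline code (X-Pipeline is MATLAB-based). It assumes a two-detector network with a real unit vector `e_plus` along the plus response, so that only the plus and null energies exist, and it assumes the incoherent energy is the sum over detectors of $|e_\alpha \tilde{d}_\alpha|^2$. The function name `pixel_energies` and the sample data values are hypothetical.

```python
def pixel_energies(d, e_plus):
    """Coherent (E) and incoherent (I) energies for one time-frequency pixel.

    d      -- whitened complex data, one value per detector (two detectors)
    e_plus -- real unit vector along the plus-polarization response
    """
    # For two detectors the null direction is the unit vector orthogonal to e_plus.
    e_null = (-e_plus[1], e_plus[0])

    def coherent(e):
        # Project first, then square: cross-correlation terms are retained.
        return abs(sum(ea * da for ea, da in zip(e, d))) ** 2

    def incoherent(e):
        # Square each detector's term first, then sum: auto-correlation terms only.
        return sum(abs(ea * da) ** 2 for ea, da in zip(e, d))

    return {
        "E_plus": coherent(e_plus), "I_plus": incoherent(e_plus),
        "E_null": coherent(e_null), "I_null": incoherent(e_null),
    }


e_plus = (0.8, 0.6)
gw = pixel_energies((4 + 0j, 3 + 0j), e_plus)    # data parallel to e_plus (GW-like)
glitch = pixel_energies((5 + 0j, 0j), e_plus)    # transient in one detector only
```

In this toy case the GW-like data give a null energy close to zero while the incoherent null energy stays large, the single-detector glitch gives $E = I$ for both energies, and in both cases the coherent and incoherent energies have the same total.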

III. OVERVIEW OF X-PIPELINE 

X-Pipeline is a MATLAB-based software package for performing coherent
searches for gravitational-wave bursts in data from arbitrary networks
of detectors. In this section we give an overview of the main steps
followed in a triggered burst search, describing how the data is
processed and how candidate GWBs are identified. In Section V we discuss
how an X-Pipeline analysis is triggered.

A. Preliminaries 

X-Pipeline performs the coherent analyses described in Section II. The
user (a human or automated triggering software) specifies:

1. a set of detectors; 

2. one or more intervals of data to be analysed; 



3. a set of coherent energies to compute; 

4. a set of sky positions; and 

5. a list of parameters (such as FFT lengths) for the 
analysis. 

In standard usage, X-Pipeline processes the data and produces lists of
candidate gravitational-wave signals for each of the specified sky
positions. It does this by first constructing time-frequency maps of the
various energies in the reconstructed $h_+$, $h_\times$, and null
streams. X-Pipeline then identifies clusters of pixels with large values
of one of the coherent energies, such as $E_{\mathrm{SL}}$.



B. Time- frequency maps 

X-Pipeline typically processes data in 256 s blocks. First, it loads the
requested data. It constructs a zero-phase linear predictor error filter
to whiten the data and estimate the power spectrum [lH HO] . For each
sky position, X-Pipeline time-shifts the data from each detector
according to equations (2.1) and (2.2). The data is divided into
overlapping segments and Fourier transformed, producing time-frequency
maps for each detector. Given the time-frequency maps for the individual
detector data streams $d$, X-Pipeline coherently sums and squares these
maps in each pixel to produce time-frequency maps of the desired
coherent energies; see Figure 2. This representation gives easy access
to the temporal evolution of the spectral properties of the signal, and
all statistics and other quantities that are functions of time and
frequency.



C. Clustering and Event Identification 

Given time-frequency maps of each of the coher- 
ent energies, the challenge is then to identify potential 
gravitational-wave signals in these maps. 

FIG. 2: A simulated $1.4\,M_\odot$-$10.0\,M_\odot$ neutron star black
hole inspiral at an effective distance of 37 Mpc, added to simulated
noise from the two LIGO-Hanford detectors. (top) Time-frequency map of
the energy. (bottom) The highest 1% of pixels highlighted. The inspiral
"chirp" is clearly visible.

The approach used in X-Pipeline is pixel clustering [18]. The user
singles out one of the energy measures - typically $E_{\mathrm{SL}}$,
the summed energy in the reconstructed $h_+$ and $h_\times$ streams - as
the detection statistic. A threshold is applied to the detection
statistic map so that a fixed percentage (e.g., 1%) of the pixels with
the highest value in the current map are marked as black pixels; see
Figure 2. Following the method of [18], black pixels that share a common
side (nearest neighbors) are grouped together into clusters; see
Figure 3 for an example. (As allowed in [18], the user may specify a
different connectivity criterion, such as next-nearest neighbors, or
apply the "generalized clustering" procedure.) This clustering technique
is appropriate for a GWB whose shape in the time-frequency plane is
connected, as opposed to consisting of well-separated "blobs". This
assumption is valid for many well-modeled signals such as low-mass
inspirals and ringdowns.
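The thresholding and nearest-neighbor grouping described above can be sketched in a few lines. This is an illustrative Python sketch, not X-Pipeline code; the function names and the representation of a map as a list of rows are assumptions. Ties at the threshold value are all kept, so slightly more than the requested fraction of pixels may be marked.

```python
def black_pixels(tf_map, fraction=0.01):
    """Return the set of (row, col) pixels holding the top `fraction` of values."""
    values = sorted((v for row in tf_map for v in row), reverse=True)
    n_keep = max(1, int(round(fraction * len(values))))
    threshold = values[n_keep - 1]
    return {(i, j) for i, row in enumerate(tf_map)
            for j, v in enumerate(row) if v >= threshold}


def cluster(pixels):
    """Group pixels that share a common side (nearest neighbors) into clusters."""
    remaining = set(pixels)
    clusters = []
    while remaining:
        seed = remaining.pop()
        stack, members = [seed], [seed]
        while stack:
            i, j = stack.pop()
            for nb in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    stack.append(nb)
                    members.append(nb)
        clusters.append(sorted(members))
    return clusters


def cluster_statistic(tf_map, members):
    """Sum the detection statistic over the pixels of one cluster."""
    return sum(tf_map[i][j] for i, j in members)


tf_map = [[9, 8, 0, 0, 0],
          [0, 7, 0, 0, 6],
          [0, 0, 0, 0, 5],
          [0, 0, 0, 0, 0]]
found = cluster(black_pixels(tf_map, fraction=0.25))
stats = sorted(cluster_statistic(tf_map, c) for c in found)
```

Switching to next-nearest-neighbor connectivity would only require adding the four diagonal offsets to the neighbor list.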

Each cluster is considered a candidate detection event. Each is assigned
a detection statistic value from its constituent pixels by simply
summing the values of the statistic in the pixels. This is motivated by
the additive property of the log-likelihood ratio - the inherited
detection statistic is exactly the detection statistic for the area
defined by the cluster. Each cluster is also assigned an approximate
statistical significance $S$ based on the $\chi^2$ distribution; see
equation (2.32). This significance is used when comparing different
clusters to determine which is the "loudest" - the best candidate for
being a gravitational wave signal. Finally, the energy at the same
time-frequency locations in maps of each of the other requested
likelihoods is also computed and recorded for each cluster.

FIG. 3: A time frequency map with "black" pixels grouped into 3
clusters. Nearest-neighbor black pixels (those that share an edge) are
grouped into a single cluster. Each cluster in this image is denoted by
a different color and hatching pattern.

The clusters are saved for later post-processing. The analysis of time
shifting (see equation (2.2)), FFTing, and cluster identification is
then repeated for each of the other sky positions and for each of the
requested FFT lengths.

One other important feature of the time-frequency 
maps is the Fourier transform length or analysis time T, 
which determines the aspect ratio of the pixels. A longer 
time gives pixels with poor time resolution but good fre- 
quency resolution; a shorter time gives pixels with good 
time resolution but poor frequency resolution. Depend- 
ing on the signal duration, different analysis times may 
be optimal. Since each pixel has the same noise distri- 
bution (assuming Gaussian statistics), the optimal pixel 
size is the size for which the signal spans the smallest 
number of pixels, so that the statistic is least polluted by 
noise. 

Since the optimal analysis time for the incoming signal
is not known, X-Pipeline uses several analysis times,
and applies a second layer of clustering between analy- 
sis times. For this second layer of clustering, clusters 
made from black pixels at two different analysis times 
that overlap in time and frequency are compared. The 
cluster that has the largest significance is kept as a can- 
didate event; the less significant overlapping clusters are 
discarded. 
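The second layer of clustering can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not X-Pipeline code: each cluster is summarised by a hypothetical bounding box in time and frequency, and the selection is done greedily in order of decreasing significance, which is one simple way to realise "keep the most significant of the overlapping clusters".

```python
from collections import namedtuple

# Hypothetical summary of a cluster: bounding box, significance, analysis time.
Cluster = namedtuple("Cluster", "t_min t_max f_min f_max significance fft_length")


def overlaps(a, b):
    """True if the two bounding boxes overlap in both time and frequency."""
    return (a.t_min < b.t_max and b.t_min < a.t_max and
            a.f_min < b.f_max and b.f_min < a.f_max)


def select_across_resolutions(clusters):
    """Keep a cluster unless a more significant overlapping cluster,
    found at a different analysis time, has already been kept."""
    kept = []
    for c in sorted(clusters, key=lambda c: c.significance, reverse=True):
        if not any(k.fft_length != c.fft_length and overlaps(k, c) for k in kept):
            kept.append(c)
    return kept


a = Cluster(0.0, 1.0, 100.0, 200.0, 50.0, 1 / 8)
b = Cluster(0.5, 0.6, 150.0, 180.0, 30.0, 1 / 64)   # overlaps a, less significant
c = Cluster(5.0, 5.1, 300.0, 400.0, 10.0, 1 / 64)   # isolated
survivors = select_across_resolutions([a, b, c])
```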



D. Glitch rejection 

As noted in Section II F, noise glitches tend to have a strong
correlation between each coherent energy $E_{\mathrm{null}}$, $E_+$,
$E_\times$, and its corresponding incoherent energy
$I_{\mathrm{null}}$, $I_+$, $I_\times$. X-Pipeline compares the coherent
and incoherent energies to veto events that have properties similar to
the noise background. These coherent veto tests are applied in
post-processing (i.e., after candidate events from the different
analysis times are generated and combined).

Two types of coherent veto are available in X-Pipeline. Both are
pass/fail tests. The simplest is a threshold on the ratio $I/E$.
Following the discussion in Section II F, a cluster passes the coherent
test if

$$I_{\mathrm{null}} / E_{\mathrm{null}} > r_{\mathrm{null}} , \tag{3.1}$$
$$\left| \log_{10}(I_+ / E_+) \right| > \log_{10}(r_+) , \tag{3.2}$$
$$\left| \log_{10}(I_\times / E_\times) \right| > \log_{10}(r_\times) , \tag{3.3}$$

where the thresholds $r_{\mathrm{null}}$, $r_+$, and $r_\times$ may be
specified by the user or chosen automatically by X-Pipeline. The form of
equations (3.2) and (3.3) makes these tests two-sided; i.e., they pass
clusters that are sufficiently far above or below the diagonal.
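A minimal sketch of the ratio test in equations (3.1)-(3.3) is given below. It is illustrative Python, not X-Pipeline code; the dictionary keys and the convention that a threshold of `None` disables a test are assumptions, and all energies are assumed to be positive.

```python
import math


def passes_ratio_veto(cluster, r_null=None, r_plus=None, r_cross=None):
    """Return True if the cluster survives every enabled ratio test."""
    # One-sided test on the null energies, equation (3.1).
    if r_null is not None:
        if not cluster["I_null"] / cluster["E_null"] > r_null:
            return False
    # Two-sided tests on the plus and cross energies, equations (3.2) and (3.3).
    for key, r in (("plus", r_plus), ("cross", r_cross)):
        if r is not None:
            ratio = cluster["I_" + key] / cluster["E_" + key]
            if not abs(math.log10(ratio)) > math.log10(r):
                return False
    return True


glitch_like = {"E_null": 100.0, "I_null": 105.0, "E_plus": 50.0, "I_plus": 52.0}
signal_like = {"E_null": 10.0, "I_null": 100.0, "E_plus": 300.0, "I_plus": 100.0}
```

With `r_null=1.5` and `r_plus=2`, the glitch-like cluster (close to the diagonal) is rejected and the signal-like cluster is kept.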




FIG. 4: $I_{\mathrm{null}}$ vs. $E_{\mathrm{null}}$ for clusters
produced by background noise (+) and by simulated gravitational-wave
signals (□). The color axis is the base-10 logarithm of the cluster
significance $S$. Loud glitches are vetoed by discarding all clusters
that fall below the dashed line.



The second type of coherent veto test in X-Pipeline is 
called the median-tracking veto test. In this test, the ex- 
clusion curve is nonlinear and designed to approximately 
follow the measured distribution of background clusters. 

Examination of scatter plots of $I$ vs. $E$ for background clusters
shows that, while $I \approx E$ for loud glitches, there is a bias to
$I > E$ at low amplitudes. Furthermore, the width of the distribution of
background events around the diagonal varies with $E$. A simple scaling
argument shows that for large-amplitude uncorrelated glitches we expect

$$\langle (E - I)^2 \rangle \sim I . \tag{3.4}$$

Specifically, for a large single-detector glitch $g(f)$, the correlation
with the noise $n(f)$ in another detector will have mean zero and
variance $\propto |g|^2 \propto I$. Consequently, we expect noise events
to be scattered about the diagonal with a width that is proportional to
$I^{1/2}$ (recall that the energies are dimensionless quantities). The
median-tracking test uses this information by estimating the median
value of $I$ as a function of $E$ for background events. For each
cluster to be tested, it computes the following simple measure of how
far the cluster is above or below the median:

$$n \equiv \frac{I - I_{\mathrm{med}}(E)}{I^{1/2}} . \tag{3.5}$$

An event is passed if

$$n_{\mathrm{null}} > r_{\mathrm{null}} , \tag{3.6}$$
$$|n_+| > r_+ , \tag{3.7}$$
$$|n_\times| > r_\times . \tag{3.8}$$

As in the ratio test, the thresholds for each energy type 
are independent and may be specified by the user or se- 
lected automatically by X-Pipeline. 

The median function $I_{\mathrm{med}}(E)$ is estimated as follows.
First, a set of background clusters are binned in $\log_{10} E$ and the
median values of $\log_{10} E$ and $\log_{10} I$ in each bin are
measured. A quadratic curve of the form

$$\log_{10} I = a (\log_{10} E)^2 + b \log_{10} E + c \tag{3.9}$$

is fit to these sampled medians. The quadratic is merged smoothly to the
diagonal $I = E$ above some value of $E$. This shape is entirely ad hoc,
but in practice it provides a good fit to the observed distribution of
glitches.
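The median-tracking procedure can be sketched as below. This is an illustrative Python sketch, not X-Pipeline code. The equal-width binning, the plain least-squares fit, and all function names are assumptions, and the smooth merge of the quadratic onto the diagonal $I = E$ is omitted because the text does not specify how it is done.

```python
import math
from statistics import median


def binned_medians(E, I, n_bins=10):
    """Median (log10 E, log10 I) in equal-width bins of log10 E."""
    x = [math.log10(e) for e in E]
    y = [math.log10(i) for i in I]
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins or 1.0
    bins = {}
    for xi, yi in zip(x, y):
        k = min(int((xi - lo) / width), n_bins - 1)
        bins.setdefault(k, []).append((xi, yi))
    return [(median(p[0] for p in pts), median(p[1] for p in pts))
            for _, pts in sorted(bins.items())]


def fit_quadratic(points):
    """Least-squares fit y = a x^2 + b x + c via the 3x3 normal equations."""
    s = [sum(x ** k for x, _ in points) for k in range(5)]
    t = [sum(y * x ** k for x, y in points) for k in range(3)]
    m = [[s[4], s[3], s[2], t[2]],     # rows ordered for unknowns (a, b, c)
         [s[3], s[2], s[1], t[1]],
         [s[2], s[1], s[0], t[0]]]
    for col in range(3):               # Gauss-Jordan with partial pivoting
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [mr - f * mc for mr, mc in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def median_tracking_measure(E, I, coeffs):
    """Distance of a cluster from the fitted median, in units of I**0.5."""
    a, b, c = coeffs
    x = math.log10(E)
    I_med = 10 ** (a * x * x + b * x + c)
    return (I - I_med) / math.sqrt(I)


coeffs = fit_quadratic([(0, 0.2), (1, 0.8), (2, 1.6), (3, 2.6), (4, 3.8)])
```

In use, the points passed to `fit_quadratic` would come from `binned_medians` applied to the background clusters, and the resulting measure would be compared with the thresholds of equations (3.6)-(3.8).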

An example of the median-tracking coherent glitch veto is shown in
Figure 4. Each plus symbol (+) denotes a background cluster, colored by
its significance $\log_{10} S$. The large mass of light points at lower
left are weak background noise events. The darkly colored points
extending along the diagonal to upper right are strong background noise
events. Also shown are clusters due to a series of simulated
gravitational-wave signals added to the data, denoted by squares (□).
Even though many of these simulated signals are weaker (lighter) than
the strong background noise glitches, they are well separated from the
background noise population in the two-dimensional
$(E_{\mathrm{null}}, I_{\mathrm{null}})$ space. The dashed line shows
the coherent veto threshold placed on
$(E_{\mathrm{null}}, I_{\mathrm{null}})$; points below this line are
discarded. Scatterplots of $I_+$ vs. $E_+$ and $I_\times$ vs. $E_\times$
have similar appearance; see Section IV for examples.

In addition to the coherent glitch vetoes, clusters may 
also be rejected because they overlap data quality vetoes. 
These are periods when one or more detectors showed evi- 
dence of being disturbed by non-gravitational effects that 
are known to produce noise glitches. Such sources include 
environmental noise and instabilities in the detector con- 
trol systems. These data quality vetoes are defined by 
studies of the data independently of X-Pipeline, and 
hence are outside of the scope of this report. See [Tn[T2] 
for recent reviews of data quality and detector character- 
isation efforts in the LIGO Scientific Collaboration and 
the Virgo Collaboration. 
E. Triggered search: tuning and upper limits

We now focus on the strategy for conducting trig- 
gered searches with X-Pipeline, specifically searches for 
gravitational waves associated with gamma-ray bursts 
(GRBs). As pointed out by Hayama et al. [25], GRB
searches are an excellent case for the application of coher- 
ent analysis, since the sky position of the source is known 
a priori to high accuracy. We can therefore take full ad- 
vantage of coherent combinations of the data streams 
without the false-alarm or computational penalties of 
scanning over thousands of trial sky directions. 



1. Detection Procedure

For the purposes of a search for unmodelled gravitational-wave emission,
a GRB source is characterised by its sky position $\Omega$, the time of
onset of gamma-ray emission (the trigger time) $t_0$, and by the range
of possible time delays $\Delta t$ between the gamma-ray emission and
the associated gravitational-wave emission. The latter quantity is
referred to as the on-source window for the GRB; this is the time
interval which is analysed for candidate signals. LIGO searches for
gravitational wave bursts associated with GRBs [101 SIl 112] have
traditionally used an asymmetric on-source window of
$[t_0 - 120\,\mathrm{s},\ t_0 + 60\,\mathrm{s}]$, which is conservative
enough to encompass most theoretical models of gravitational-wave
emission for this source, as well as uncertainties associated with
$t_0$ [31 HI] .

In order to claim a detection of a gravitational wave, we need to be
able to establish with high confidence that a candidate event is
statistically inconsistent with the noise background. In X-Pipeline GRB
searches, we use the loudest event statistic [131131] to characterise
the outcome of the experiment. The loudest event is the cluster in the
on-source interval that has the largest significance (after application
of vetoes); let us denote its significance by
$S^{\mathrm{on}}_{\max}$. We compare $S^{\mathrm{on}}_{\max}$ to the
cumulative distribution $C(S_{\max})$ of loudest significances measured
using background noise (discussed below). We set a threshold on
$C(S_{\max})$ such that the probability of background noise producing a
cluster in the on-source interval with significance above this threshold
is a specific small value (for example, a 1% chance). The on-source data
is then analysed. If the significance $C(S^{\mathrm{on}}_{\max})$ of the
loudest cluster is greater than our threshold, we consider the cluster
as a possible gravitational wave detection. We can also set an upper
limit on the strength of gravitational-wave emission associated with the
GRB in question.

In principle, the cumulative distribution $C(S_{\max})$ of loudest-event
significances for clusters produced by Gaussian background noise can be
estimated a priori. In practice, however, real detector data is
non-Gaussian. The most straightforward procedure for estimating the
background distribution is then simply to analyse additional data from
times near the GRB, but outside the on-source interval. These data are
referred to as off source. The off-source clusters will not contain a
gravitational-wave signal associated with the GRB, and so they can be
treated as samples of the noise background. In X-Pipeline, we divide the
off-source data into segments of the same length as that used for the
on-source data, and analyse each segment in exactly the same manner as
the on-source data (using, for example, the same source direction
relative to the detectors for computing coherent combinations). For each
segment, we determine the significance of the loudest event after
applying vetoes. This collection of loudest-event significances from the
off-source data then serves as the empirical measurement of
$C(S_{\max})$.

In X-Pipeline we typically set the off-source data to be all data within
±1.5 hours of the GRB time, excluding the on-source interval. This time
range is limited enough so that the detectors should be in a similar
state of operation as during the GRB on-source interval, but long enough
to provide typically ~50 off-source segments for sampling
$C(S_{\max})$, thereby allowing estimation of probabilities as low as
~2%. To get still better estimates of the background distribution, we
also analyse off-source data after artificially time-shifting the data
from one or more detectors by different amounts ranging from a few
seconds to several hundred seconds. These shifts can give up to
approximately 1000 times the on-source data for background estimation,
allowing estimation of probabilities at the sub-1% level.
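The empirical background estimate can be sketched as below. This is illustrative Python, not X-Pipeline code; the function names, the representation of segments as lists of surviving cluster significances, and the choice of 0.0 for a segment with no surviving cluster are assumptions.

```python
def loudest_events(segments):
    """Largest surviving significance in each off-source segment.

    segments -- list of lists; each inner list holds the significances of the
                clusters that survive the vetoes in one off-source segment.
    """
    return [max(seg) if seg else 0.0 for seg in segments]


def background_probability(s_on_max, s_max_off):
    """Fraction of off-source segments whose loudest event is at least s_on_max."""
    return sum(1 for s in s_max_off if s >= s_on_max) / len(s_max_off)


def smallest_resolvable_probability(n_segments):
    """With N background samples, probabilities below 1/N cannot be resolved."""
    return 1.0 / n_segments


s_max_off = loudest_events([[1.0, 7.5], [], [3.2], [9.1, 2.0]])
p_value = background_probability(5.0, s_max_off)
```

The last helper makes the counting in the text explicit: about 50 unshifted segments resolve probabilities down to about 2%, and on the order of 1000 time-shifted samples reach the sub-1% level.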

Networks containing both the LIGO-Hanford detectors, H1 and H2, present
a special case for background estimation, as local environmental
disturbances can produce simultaneous background glitches which are not
accounted for in time slides. We therefore do not time-shift H1 relative
to H2 unless they are the only detectors operating. In that case, the
local probability is computed both with and without time slides to allow
a consistency check on the background estimation. (Triggered searches
with second-scale on-source windows have the advantage of not requiring
time shifts at all; see for example [45].) In practice, we do not see
significant differences due to correlated environmental disturbances. We
attribute this robustness to the coherent glitch rejection tests
described in Section III D.



2. Upper Limits 

The comparison of the largest significance measured in the on-source
data, $S^{\mathrm{on}}_{\max}$, to the cumulative distribution
$C(S_{\max})$ estimated from the off-source data allows us to determine
if there is a statistically significant transient associated with the
GRB. If no statistically significant signal is present, we set a
frequentist upper limit on the strength of gravitational waves
associated with the GRB. For a given gravitational-wave signal model, we
define the 90% confidence level upper limit on the signal amplitude as
the minimum amplitude for which there is a 90% or greater chance that
such a signal, if present in the on-source region, would have produced a
cluster with significance larger than the largest value
$S^{\mathrm{on}}_{\max}$ actually measured.

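We adopt the measure of signal amplitude that is standard for LIGO burst
searches, the root-sum-squared amplitude $h_{\mathrm{rss}}$, defined by

$$h_{\mathrm{rss}} = \sqrt{\int_{-\infty}^{\infty} dt \left[ h_+^2(t) + h_\times^2(t) \right]}
  = \sqrt{2 \int_0^{\infty} df \left[ |\tilde{h}_+(f)|^2 + |\tilde{h}_\times(f)|^2 \right]} . \tag{3.10}$$

The units of $h_{\mathrm{rss}}$ are $\mathrm{Hz}^{-1/2}$, the same as
for amplitude spectra, making it a convenient quantity for comparing to
detector noise curves. For narrow-band signals, the $h_{\mathrm{rss}}$
can also be linked to the energy emitted in gravitational waves under
the assumption of isotropic radiation via [46]

$$E^{\mathrm{iso}}_{\mathrm{GW}} \simeq \frac{\pi^2 c^3}{G} D^2 f_0^2 h_{\mathrm{rss}}^2 , \tag{3.11}$$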



where $D$ is the distance to the source and $f_0$ is the dominant
frequency of the radiation. One drawback of $h_{\mathrm{rss}}$ is that
it does not involve the detector sensitivity (either antenna response or
noise spectrum). As a result, upper limits phrased in terms of
$h_{\mathrm{rss}}$ will depend on the family and frequency of waveforms
used, and also on the sky position of the source.
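As a rough numerical illustration of the link between $h_{\mathrm{rss}}$ and emitted energy, the sketch below evaluates the narrow-band isotropic relation $E \simeq (\pi^2 c^3 / G)\, D^2 f_0^2 h_{\mathrm{rss}}^2$. It is illustrative Python only; the form of the relation is the one assumed for equation (3.11), the physical constants are rounded, and the amplitude, frequency, and distance used in the example are arbitrary choices rather than values from this analysis.

```python
import math

G_NEWTON = 6.674e-11                 # m^3 kg^-1 s^-2 (rounded)
C_LIGHT = 2.998e8                    # m s^-1 (rounded)
MPC = 3.086e22                       # metres per megaparsec (rounded)
MSUN_C2 = 1.989e30 * C_LIGHT ** 2    # joules in one solar rest mass


def isotropic_energy(h_rss, f0_hz, distance_mpc):
    """Energy in joules for a narrow-band signal, assuming isotropic emission."""
    d = distance_mpc * MPC
    return (math.pi ** 2 * C_LIGHT ** 3 / G_NEWTON) * d ** 2 * f0_hz ** 2 * h_rss ** 2


# Arbitrary example: h_rss = 1e-21 Hz^-1/2 at 150 Hz from a source at 10 Mpc.
energy_joules = isotropic_energy(1e-21, 150.0, 10.0)
energy_msun = energy_joules / MSUN_C2
```

Because the energy scales as $D^2 f_0^2 h_{\mathrm{rss}}^2$, halving the amplitude upper limit lowers the corresponding energy limit by a factor of four.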

To set the upper limit, we need to determine how 
strong a real gravitational-wave signal needs to be in 
order to appear with a given significance. We do this 
using a third set of clusters, one which contains sample 
gravitational-wave signals. Specifically, we repeatedly re- 
analyse the on-source data after adding ("injecting") sim- 
ulated gravitational-wave signals to the data from each 
detector. The data is then analysed as before, produc- 
ing lists of clusters. The significance associated with a 
given injection is the largest significance of all clusters 
that were observed within a short time window (typically 
0.1s) of the injection time, after applying vetoes. 

The procedure for setting an upper limit is: 

1. Select one or more families of waveforms for which 
the upper limit will be set. For example, a com- 
mon choice in LIGO is linearly polarized, Gaussian- 
modulated sinusoids ( "sine-Gaussians" ) with fixed 
central frequency and quality factor, and random 
peak time and polarization angle. 



2. Find the significance $S^{\mathrm{on}}_{\max}$ of the loudest event
in the on-source data, after applying the coherent glitch veto
(Section III D) and any data-quality vetoes.

3. For each waveform family:

(a) Generate random parameter values for a large number of waveforms
from the family (e.g., specific peak times and polarization angles for
the sine-Gaussian case), and with fixed $h_{\mathrm{rss}}$ amplitude.

(b) Add the waveforms one-by-one to the on-source data, and determine
the largest significance of any surviving cluster (after vetoes)
associated with each injection.

(c) Compute the percentage of the injections that have
$S > S^{\mathrm{on}}_{\max}$.

(d) Repeat 3a - 3c using the same waveform family but with different
$h_{\mathrm{rss}}$ amplitudes. The 90% confidence-level upper limit is
that $h_{\mathrm{rss}}$ for which 90% of the injections have
$S > S^{\mathrm{on}}_{\max}$.

3. Tuning and Closed-Box Analyses 

The sensitivity of the pipeline is determined by the 
relative significance of the clusters produced by real 
gravitational-wave signals to those produced by back- 
ground noise. This in turn depends on the details of how 
the analysis is carried out. In particular, the thresholds 
used for the coherent glitch rejection tests will have a 
significant impact on the sensitivity. Too low a threshold 
will allow background noise glitches to survive, and pos- 
sibly appear louder than a real gravitational-wave signal. 
Too high a threshold may reject the gravitational-wave 
signals we seek. 

To improve the sensitivity of X-Pipeline searches, we 
tune the coherent glitch test to optimize the trade-off 
between glitch rejection and signal acceptance. We do 
this using a closed-box analysis. A closed-box analysis 
estimates the pipeline sensitivity using the off-source and 
injection data, but not the on-source data. This blind 
tuning avoids the possibility of biasing the upper limit. 

The procedure used for a closed-box analysis follows 
that used for computing an upper limit, except that an 
off-source segment is used as a substitute for the true 
on-source segment. We then test different thresholds for 
the coherent veto tests, and select the threshold set that 
gives us the best average "upper limit" estimated from 
the off-source segments. Specifically, we do the following: 

1. For each coherent veto test ($E_+$ vs. $I_+$, $E_\times$ vs.
$I_\times$, $E_{\mathrm{null}}$ vs. $I_{\mathrm{null}}$) we select a
discrete set of trial veto thresholds to test.

2. The off-source segments and the injection clusters 
are divided randomly into two equal sets: one for 
tuning, and one for upper-limit estimation. 

3. For each distinct combination of trial thresholds
($r_+$, $r_\times$, $r_{\mathrm{null}}$), we do the following:

(a) We apply the coherent veto test (and any data quality vetoes) to the
background clusters from each of the tuning off-source segments. The
collection of loudest surviving events from each segment gives us
$C(S_{\max})$ for that set of trial thresholds.

(b) We determine the off-source segment that gives the loudest event
closest to the 95th percentile of the off-source $S_{\max}$ (i.e.,
closest to $C(S_{\max}) = 0.95$). This off-source segment is termed the
dummy on-source segment. (Different background segments may serve as the
dummy on-source for different trial values of the coherent veto
thresholds.)

(c) The dummy on-source clusters and the tun- 
ing injection clusters are read, and the coher- 
ent vetoes and data-quality vetoes are applied 
to each. The upper limit is computed, treat- 
ing the dummy clusters as the true on-source 
clusters. 

4. The final, tuned veto thresholds are the ones that 
give the lowest upper limit based on the dummy 
on-source clusters. (If testing multiple waveform 
families, the upper limits may be averaged across 
families for deciding the optimal tuning.) 

5. To get an unbiased estimate of the expected upper limit, we apply the
tuned vetoes to the second set of off-source and injection clusters,
that were not used for tuning. Steps 3a - 3c are repeated using the
final thresholds, and using the 50th percentile of $S_{\max}$ to choose
the dummy on-source segment. The upper limit estimated from the dummy
on-source segment in this second data set is the predicted upper limit
for the GRB; equivalently, it may be interpreted as the sensitivity of
the search.

We choose the 95th percentile of $S_{\max}$ for tuning to focus on
eliminating the tail of high-significance background glitches. This is a
deliberate choice, since to be accepted as a detection, a GWB will need
to stand well clear of the background. We choose the 50th percentile of
$S_{\max}$ as the dummy on-source value for sensitivity estimation
because this is our best prediction for the typical value of
$S_{\max}$ in
the on-source data under the null hypothesis. Separate 
data sets are used in tuning and sensitivity estimation to 
avoid bias from tuning the cuts on the same data used 
to estimate the sensitivity. The data set used for closed- 
box sensitivity estimates is later re-used for computing 
event probabilities and upper limits for the "open-box" 
(true on-source) data; this introduces no bias because 
no tuning decisions are made based on the closed-box 
sensitivity estimate. 

In X-Pipeline, the tuning and upper limit calculations
are automated. The closed-box analysis is performed
first using a pre-selected range of trial thresholds
for the coherent glitch test. A web page is generated au- 
tomatically reporting the details of the closed box analy- 
sis, including the optimized threshold values and the pre- 
dicted upper limits. For the S5/VSR1 search, the user 
re-runs the post-processing on the on-source data with 
the fixed optimized thresholds, and another web page re- 
port is generated listing detection candidates and upper 
limits. For the S6/VSR2 search, we propose to automate 



this "box opening" as well, so that the on-source events 
are scanned for candidate GWBs immediately once the 
closed-box tuning analysis has finished. 



4. Statistical and Systematic Errors

There are several sources of error that can affect our 
analysis. The principal ones are calibration uncertainties 
(amplitude and phase response of the detectors, and rel- 
ative timing errors), and uncertainty in the sky position 
of the GRB. 

X-Pipeline is able to account for these effects automatically
in tuning and upper limit estimation. Specifically,
X-Pipeline's built-in simulation engine for injecting
GWB signals is able to perturb the amplitude,
phase, and time delays for each injection in each detec- 
tor. The perturbations are drawn from Gaussian distri- 
butions with mean and variance matching the calibration 
uncertainties for each detector. Furthermore, the GRB 
sky position can be perturbed in a random direction by 
a Gaussian-distributed angle with standard deviation set 
to the GRB error box width reported by the GCN. Tun- 
ing and upper limits based on the perturbed injections 
are effectively marginalized over these sources of error. 
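The perturbation of injections can be sketched as below. This is illustrative Python, not X-Pipeline's simulation engine. The function names are hypothetical, the perturbations are assumed to be centred on no change (scale factor 1, zero phase and time shift), and the sky-position offset is applied with the standard great-circle destination formula at a uniformly random bearing.

```python
import math
import random


def draw_calibration_perturbation(rng, amp_sigma, phase_sigma, time_sigma):
    """Draw one detector's amplitude scale, phase shift (rad), and time shift (s)."""
    return {
        "amplitude_scale": rng.gauss(1.0, amp_sigma),
        "phase_shift": rng.gauss(0.0, phase_sigma),
        "time_shift": rng.gauss(0.0, time_sigma),
    }


def offset_sky_position(ra, dec, angle, bearing):
    """Move (ra, dec) by `angle` radians along a great circle at `bearing`."""
    sin_dec2 = (math.sin(dec) * math.cos(angle)
                + math.cos(dec) * math.sin(angle) * math.cos(bearing))
    dec2 = math.asin(max(-1.0, min(1.0, sin_dec2)))
    ra2 = ra + math.atan2(math.sin(bearing) * math.sin(angle) * math.cos(dec),
                          math.cos(angle) - math.sin(dec) * sin_dec2)
    return ra2 % (2 * math.pi), dec2


def perturb_sky_position(rng, ra, dec, sigma):
    """Offset by a Gaussian-distributed angle in a uniformly random direction."""
    angle = abs(rng.gauss(0.0, sigma))
    bearing = rng.uniform(0.0, 2 * math.pi)
    return offset_sky_position(ra, dec, angle, bearing)


rng = random.Random(0)
calibration = draw_calibration_perturbation(rng, 0.1, 0.05, 1e-4)
ra_new, dec_new = perturb_sky_position(rng, 1.0, -0.1, 0.01)
```

Drawing a fresh set of perturbations for every injection is what allows the resulting efficiencies, and hence the tuning and upper limits, to average over these uncertainties.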

For the S5-VSR1 GRB search, the capability for per- 
turbed injections was not available at the time of the 
original data analysis, and so the impact of the errors 
was estimated by re-analysis of a small subset of the full 
GRB sample. For the S6-VSR2 search, we include cali- 
bration and sky-position uncertainties in simulations for 
all GRBs from the beginning, removing the need to do 
any additional error analysis. 



IV. GRB 031108 

GRB 031108 [17] was a long GRB observed by 
Ulysses, Konus-Wind, Mars Odyssey-HEND and GRS, 
and RHESSI. As observed by Ulysses, it had a dura- 
tion of approximately 22 seconds, a 25-100 keV fluence 
of approximately 2.5 x 10~^ erg/cm^, and a peak flux of 
approximately 1.8 x 10~^ erg/cm^ s over 0.50 seconds. It 
was triangulated to a 3-sigma error box with approximate 
area 1600 square arcminutes with center coordinates 4h 
26m 54.86s, -5° 55' 49.00". 

GRB 031108 occurred during the third science run of the LIGO Scientific
Collaboration ("S3"). At that time, the two LIGO-Hanford detectors H1
and H2 were operating, while the Livingston detector L1 was not. A
search for gravitational waves associated with the GRB was performed
using a cross-correlation algorithm, and reported in Abbott et al. [42].

To demonstrate X-Pipeline, we perform a closed-box analysis [54] of the
LIGO H1-H2 data to search for gravitational waves associated with
GRB 031108. We tune the search and estimate its sensitivity to
gravitational-wave emission as discussed in Section III E using the same
simulated waveforms as in Abbott et al. We compare the sensitivity
results to those of the cross-correlation search in Abbott et al. We
estimate the 90% confidence upper limits from X-Pipeline to be typically
40% lower than those from the cross-correlation search.



A. Analysis 

At the time of GRB 031108, the two LIGO Hanford detectors H1 and H2 were
operating. Figure 5 shows the noise level in the detectors at that time.

FIG. 5: Noise spectra of the H1 and H2 detectors at the time of
GRB 031108 as estimated by X-Pipeline.



Since the H1 and H2 detectors have identical antenna responses, the
network is sensitive to only one of the two gravitational-wave
polarizations from any given sky direction. In the DPF, this means that
$f^\times = 0$. As a consequence, the cross energy also vanishes
identically, $E_\times = 0$, and $E_{\mathrm{SL}} = E_+$. Each event
cluster is therefore characterised by the two coherent energies $E_+$
and $E_{\mathrm{null}}$, and their associated incoherent components
$I_+$ and $I_{\mathrm{null}}$. Figure 6 shows the weighting factors
$e^+$ as a function of frequency.

X-Pipeline was run on all data within ±1 hr of the GRB time for
background estimation. Clusters were generated using Fourier transform
lengths of 1/8 s, 1/16 s, 1/32 s, 1/64 s, 1/128 s, and 1/256 s. Figure 7
shows scatter plots of $I_+$ vs. $E_+$ and $I_{\mathrm{null}}$ vs.
$E_{\mathrm{null}}$ for the half of the off-source clusters that were
used for upper limit estimation (i.e., after tuning). Also shown are the
clusters produced by simulated sine-Gaussian GWBs at 150 Hz, one of the
types tested in [42]. These injections had amplitudes of
6.3 X 10~^^Hz~^^^, approximately equal to the hrss upper 
limit estimated from the closed-box analysis. 

As expected, loud background triggers fall close to the 
diagonal in both of these plots. The simulated gravita- 
tional waves also fall close to the diagonal for vs. £'+; 




FIG. 6: Normalized contributions to $e^+ = f^+/|f^+|$ (in the 
DPF), plotted against frequency (Hz). Both detectors have the 
same antenna response, so the coherent weighting at each 
frequency is determined entirely by the relative noise 
spectra. Because $f^\times = 0$ for this network, the 
normalized contribution of detector $i$ to the null stream 
is simply $1 - (e^+_i)^2$; i.e., identical to this figure 
with the H1 and H2 curves swapped. 



this is due to the fact that H2 is significantly less 
sensitive than H1 and so receives very little weighting in 
the calculation of $E_+$. In turn, this means that the H1-H2 
cross terms in $E_+$ are small compared to the H1-H1 
term, so that $E_+$ is dominated by the diagonal components 
and so is very similar to $I_+$. For the null stream, 
however, the weightings are reversed, and H2 is weighted 
higher than H1. As a consequence, gravitational waves 
lie above the diagonal in the $I_\mathrm{null}$ vs. 
$E_\mathrm{null}$ plot, and it is possible to separate the 
injections from the background clusters in 
($E_\mathrm{null}$, $I_\mathrm{null}$) space. X-Pipeline's 
automated tuning procedure recognizes both of these facts; 
when run using the median-tracking veto test, it estimates 
that the best sensitivity will come from requiring a 
threshold of 5 on ($E_\mathrm{null}$, $I_\mathrm{null}$), and 
imposing no condition on $I_+$ vs. $E_+$. The 
($E_\mathrm{null}$, $I_\mathrm{null}$) threshold is indicated 
in Figure 7 by the dashed line; points below this line are 
discarded. As can be seen, this test rejects the majority 
of the loud off-source clusters, while accepting most of 
the simulated gravitational wave clusters. The off-source 
clusters that survive the test tend to be of low 
significance, and therefore will not affect the 
loudest-event upper limit. Figure 8 shows the distribution 
of $S_\mathrm{max}$ before and after the null-stream test. 
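
The behaviour described above can be illustrated with a minimal numerical sketch. It assumes two co-aligned detectors with white noise amplitudes in a 1:2 ratio (H1:H2), whitened single-pixel data, and the reading, taken from the text, that the incoherent energy keeps only the diagonal (auto-correlation) terms of the coherent energy. The function names and the toy amplitudes are illustrative and are not part of X-Pipeline.

```python
import numpy as np

def dpf_unit_vectors(sigma):
    """Unit vectors e_plus and e_null for two co-aligned detectors.

    sigma: noise amplitudes (H1, H2). The common antenna factor is dropped
    because it cancels when e_plus is normalized.
    """
    f_plus = 1.0 / np.asarray(sigma, dtype=float)  # noise-weighted response
    e_plus = f_plus / np.linalg.norm(f_plus)
    e_null = np.array([e_plus[1], -e_plus[0]])     # orthogonal to e_plus
    return e_plus, e_null

def energies(e, w):
    """Coherent energy E = |e.w|^2 and its diagonal part I = sum e_i^2 |w_i|^2."""
    w = np.asarray(w, dtype=complex)
    coherent = abs(np.dot(e, w)) ** 2
    incoherent = float(np.sum(e ** 2 * np.abs(w) ** 2))
    return coherent, incoherent

sigma = np.array([1.0, 2.0])          # assumed H1:H2 noise amplitude ratio
e_plus, e_null = dpf_unit_vectors(sigma)

signal = 10.0 / sigma                 # same strain in both detectors, whitened
glitch = np.array([10.0, 0.0])        # noise transient in H1 only (toy case)

for name, w in [("signal", signal), ("glitch", glitch)]:
    E_p, I_p = energies(e_plus, w)
    E_n, I_n = energies(e_null, w)
    print(f"{name}: E+={E_p:.1f} I+={I_p:.1f} Enull={E_n:.1f} Inull={I_n:.1f}")
```

In this toy case the squared weights are 0.8 (H1) and 0.2 (H2) for e_plus and are swapped for e_null. The single-detector glitch has equal coherent and incoherent energies in both planes, while the noise-free signal has a vanishing null energy but a non-zero incoherent null energy, so it sits above the diagonal in the null-stream plane only.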



The closed-box analysis discussed in Section III E was 
used to tune the coherent veto test and estimate the 
expected upper limit from X-Pipeline. Figure 9 shows 
a scatter plot of the "dummy" on-source clusters. Recall 
that the dummy on-source region is selected as the 
background segment that gives the median loudest event 
surviving the coherent veto test. It therefore represents 
the expected typical result under the null hypothesis, 
averaging over noise instantiations, and so is a more robust 
way to estimate the pipeline sensitivity than, e.g., picking 
a random segment (or even the on-source segment). 

FIG. 7: Scatter plots of off-source (+) and simulation (□) 
cluster likelihoods: $I_+$ vs. $E_+$ (top) and 
$I_\mathrm{null}$ vs. $E_\mathrm{null}$ (bottom). The color 
denotes $\log_{10}(S)$. Loud background triggers fall close 
to the diagonal. Simulated gravitational waves also fall 
close to the diagonal for $I_+$ vs. $E_+$, but above the 
diagonal for $I_\mathrm{null}$ vs. $E_\mathrm{null}$. The 
dashed line denotes the coherent consistency threshold on 
($E_\mathrm{null}$, $I_\mathrm{null}$) that is selected 
by X-Pipeline's automated tuning procedure; points below 
this line are discarded. This test rejects the majority of the 
loud off-source clusters, while accepting most of the simulated 
gravitational wave clusters, even if the GWB significance is 
typical of background events. The simulated signals in this 
plot have $h_\mathrm{rss} = 6.3 \times 10^{-21}\,\mathrm{Hz}^{-1/2}$, 
approximately equal to the upper limit estimated from the 
closed-box analysis. 

FIG. 8: Distribution of the loudest event significance 
$S_\mathrm{max}$ seen in each of the off-source segments used 
for upper limit estimation, before and after the coherent 
glitch rejection test. Only 56 of the 391 off-source segments 
have events that survive the test. 

The predicted $h_\mathrm{rss}$ upper limits at 90% confidence 
for narrow-band sine-Gaussian waveforms of different central 
frequencies are shown in Table III and Figure 10. 
Table III also shows the actual upper limits from the 
cross-correlation search reported in [42]. The predicted 
X-Pipeline sensitivity is approximately a factor of 1.7 
better than that of the cross-correlation pipeline, 
corresponding to an increase in search volume of a factor of 
$1.7^3 \simeq 5$. Similar improvements were seen in the 
open-box analysis of GRBs in the S5-VSR1 run (2005-2007). 

FIG. 9: $I_\mathrm{null}$ vs. $E_\mathrm{null}$ scatter plot 
of the dummy on-source (+) and simulation (□) cluster 
likelihoods used to estimate the upper limit. The color 
denotes $\log_{10}(S)$. No background events survive the 
coherent consistency test. The simulated signals in this plot 
have $h_\mathrm{rss} \simeq 6.3 \times 10^{-21}\,\mathrm{Hz}^{-1/2}$, 
approximately equal to the estimated upper limit. 

frequency (Hz)       100    150    250    554    1000   1850 
cross-correlation    18.4   11.3   10.9   12.5   20.4   51.5 
X-Pipeline           11.1    6.1    6.5    7.5   12.6   36.7 

TABLE III: Estimated $h_\mathrm{rss}$ amplitude upper limits from X-Pipeline and the best upper limits from the actual cross-correlation 
search [42]. The units are $10^{-21}\,\mathrm{Hz}^{-1/2}$. The simulated waveforms are circularly polarized sine-Gaussians as described in [42]. 

As can be seen in Figure 10, the limiting amplitudes 
for this GRB track the noise spectrum of H2, and 
correspond to a matched-filter signal-to-noise ratio of 
approximately 5 in H2. This occurs because the sensitivity 
of the analysis is limited by the coherent glitch 
rejection test. This test requires a measurable correlation 
between the detectors, which in turn requires that the GWB 
have some minimal signal-to-noise ratio in each. This 
behaviour is typical of tuning using the 95th percentile of 
$S_\mathrm{max}$, which is an aggressive choice designed to 
suppress the loud background. While the upper limits tend to 
be limited by such strong background rejection, our ability 
to detect a GWB is enhanced, since a GWB candidate 
will undoubtedly need a significance higher than some 
very high percentile of the background to be claimed as 
an actual gravitational wave. 
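
The selection of the dummy on-source segment and the percentile-based tuning target can be sketched as follows. This is an illustration under stated assumptions, not X-Pipeline's code: segments with no surviving event are assigned a loudest-event significance of zero, the significance values are invented, and the function names are hypothetical.

```python
import numpy as np

def pick_dummy_on_source(loudest):
    """Index of the off-source segment whose loudest surviving event is the median."""
    loudest = np.asarray(loudest, dtype=float)
    order = np.argsort(loudest, kind="stable")
    return int(order[(len(loudest) - 1) // 2])

def loudest_event_percentile(loudest, q=95.0):
    """q-th percentile of the loudest-event significance over segments."""
    return float(np.percentile(loudest, q))

# Toy background: 391 segments, of which 56 keep an event after the veto
# (the counts quoted in the Fig. 8 caption; the significances are invented).
rng = np.random.default_rng(0)
loudest = np.zeros(391)
loudest[:56] = 10.0 + rng.exponential(scale=5.0, size=56)

dummy = pick_dummy_on_source(loudest)
s95 = loudest_event_percentile(loudest)
print(loudest[dummy], s95 > 0.0)
```

With these counts the median segment has no surviving event, while the 95th percentile of the loudest-event distribution is set by the minority of segments that do. The first point is at least consistent with the Fig. 9 caption, which reports no surviving background events in the dummy on-source segment.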

The factor of 1.7 sensitivity improvement of X-Pipeline 
relative to the cross-correlation search in [42] 
can be attributed in part to two factors. We estimate 
that a factor of approximately 1.3 comes from using 
$E_\mathrm{SL}$ rather than the cross-correlation as the 
detection statistic. $E_\mathrm{SL}$ includes the 
auto-correlation terms ($d^*_\mathrm{H1} d_\mathrm{H1}$, 
$d^*_\mathrm{H2} d_\mathrm{H2}$) in addition to the 
cross-correlation terms ($d^*_\mathrm{H1} d_\mathrm{H2}$) when 
combining the H1 and H2 data streams. This gives a net 
increase in the signal-to-noise ratio. More precisely, one 
can compute the ratio of the expected contribution to 
$E_\mathrm{SL}$ due to a GWB to the standard deviation in 
$E_\mathrm{SL}$ due to Gaussian noise; see Section II E. 
Performing the same calculation for the cross-correlation 
statistic, one finds the per-pixel ratio for $E_\mathrm{SL}$ 
to be $1.8 \simeq 1.3^2$ times larger than that for the 
cross-correlation (assuming a 2:1 ratio in the noise 
amplitudes for H2:H1). Another factor of $\sim 1.2$ can be 
attributed to the clustering, which restricts the likelihood 
calculation to pixels that show significant signal power 
(thus tending to exclude pixels that contain only background 
noise). The cross-correlation statistic in [42] was 
computed on a minimum time-frequency volume (number 
of pixels) of approximately 50. By contrast, the typical 
cluster size in X-Pipeline was found to be 10-30 for 
injections at the 90% upper limit amplitude. As seen in 
Section II E and [17], the amplitude sensitivity in Gaussian 
noise scales as $N^{1/4}$, where $N$ is the number of pixels. 
The factor of $\sim 2$ smaller number of pixels used by 
X-Pipeline should therefore give a factor of 
$\sim 2^{1/4} \simeq 1.2$ sensitivity improvement. 
Combined with the previous factor of 1.3, this gives a total 
improvement of about 1.6. While this is very close to the 
average measured improvement, one should keep in mind 
that these rough estimates have not properly accounted 
for the non-Gaussianity of the background (which will 
decrease the sensitivity of both pipelines), or for the 
tendency of the coherent glitch rejection test to limit the 
X-Pipeline sensitivity in the absence of strong background 
glitches. These other effects are presumably also 
important. 
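
The two rough factors quoted above can be checked with a short calculation. The sketch below is a toy model, not the computation of Section II E: it assumes a single time-frequency pixel, whitened data with unit-variance complex Gaussian noise, the same strain in H1 and H2, a 2:1 noise amplitude ratio for H2:H1, and a cross-correlation statistic formed as the real part of the product of the two whitened data streams. All names are illustrative.

```python
import numpy as np

def per_pixel_ratio(noise_ratio=2.0, n_trials=400_000, seed=1):
    """Monte Carlo estimate of (signal / noise std) for E_SL relative to
    the same quantity for the cross-correlation, for one pixel."""
    rng = np.random.default_rng(seed)
    sigma = np.array([1.0, noise_ratio])             # H1, H2 noise amplitudes
    # Unit-variance complex Gaussian noise in each whitened data stream.
    n = (rng.normal(size=(n_trials, 2))
         + 1j * rng.normal(size=(n_trials, 2))) / np.sqrt(2)
    e_plus = (1 / sigma) / np.linalg.norm(1 / sigma)
    s = 1.0 / sigma                                  # whitened unit-amplitude signal

    # Expected signal contribution and noise-only scatter of E_SL = |e_plus . d|^2.
    esl_signal = abs(e_plus @ s) ** 2
    esl_std = np.std(np.abs(n @ e_plus) ** 2)

    # The same two quantities for the cross-correlation Re(d_H1^* d_H2).
    cc_signal = s[0] * s[1]
    cc_std = np.std(np.real(np.conj(n[:, 0]) * n[:, 1]))

    return (esl_signal / esl_std) / (cc_signal / cc_std)

energy_ratio = per_pixel_ratio()
amplitude_gain = np.sqrt(energy_ratio)     # amplitude scales as sqrt(energy)
pixel_gain = 2.0 ** 0.25                   # N^(1/4) scaling, factor ~2 fewer pixels
print(round(energy_ratio, 1), round(amplitude_gain, 1),
      round(pixel_gain, 1), round(1.3 * 1.2, 1))
```

In this toy model the four printed numbers come out near 1.8, 1.3, 1.2 and 1.6, matching the estimates in the text. The agreement depends on the simplifying assumptions listed above and says nothing about the non-Gaussian effects mentioned at the end of the paragraph.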



V. AUTONOMOUS RUNNING 




FIG. 10: 90%-confidence level upper limits on the GW 
amplitude (•) from X-Pipeline for narrow-band circularly 
polarized sine-Gaussian bursts, plotted against frequency 
(Hz). The detector noise spectra are also shown for 
reference. 



X-Pipeline has been used to process data from S5-VSR1 
(2005-2007). This is an "offline" search, being 
completed almost two years after the last of the GRBs 
in question was observed. In parallel, X-Pipeline is 
being improved for the S6-VSR2 run, which started in July 
2009. Our goal for S6-VSR2 is fully autonomous running, 
with a complete analysis of each GRB within 24 hours 
of the trigger. Achieving this goal requires automatic 
triggering of X-Pipeline. 



A. Automated launch of X-Pipeline by GCN 
triggers 

Most of the information on sources analyzed by the 
various externally triggered burst searches in 
LIGO-Virgo comes from the GRB Coordinates Network 
(GCN) [48]. GCN notices and circulars are received in 
real time by LIGO-Virgo, and the information needed 
for the search analyses is parsed automatically by Perl 
scripts which are launched each time a GCN notice or 
circular is received. The information parsed includes the 
time and date of the event, the source position (right 
ascension and declination), the position error, and the 
duration of the event. For each source, these parameters 
are compiled and written to a trigger file. 
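
As an illustration of the kind of record such a trigger file might hold, the sketch below collects the parsed quantities listed above into one line per source. The field names, units, and whitespace-separated layout are assumptions made for this example; the actual trigger file format and the Perl parsing scripts are not reproduced here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Trigger:
    """One external trigger; all field names are hypothetical."""
    name: str            # source label
    gps_time: float      # event time (s)
    ra_deg: float        # right ascension (degrees)
    dec_deg: float       # declination (degrees)
    error_deg: float     # position uncertainty (degrees)
    duration_s: float    # event duration (s)

    def to_line(self) -> str:
        """Serialize to one whitespace-separated line of the trigger file."""
        return (f"{self.name} {self.gps_time:.3f} {self.ra_deg:.4f} "
                f"{self.dec_deg:.4f} {self.error_deg:.4f} {self.duration_s:.3f}")

    @classmethod
    def from_line(cls, line: str) -> "Trigger":
        """Parse a line written by to_line()."""
        name, *values = line.split()
        return cls(name, *map(float, values))

# Placeholder values, not the measured parameters of any real GRB.
trigger = Trigger("GRB_EXAMPLE", 1000000000.0, 123.4567, -12.3456, 0.05, 22.0)
assert Trigger.from_line(trigger.to_line()) == trigger
```

Keeping one self-describing record per source makes it straightforward for a separate polling process to detect entries it has not yet handled.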

Concurrently, a Perl script runs at a central computing 
site and regularly checks whether there are new source 
events listed in the trigger file. When there are new 
triggers, the script checks for availability of the 
LIGO-Virgo data which are necessary for analyzing the source. 
If the needed data are available, the script launches 
X-Pipeline event-generation jobs (which include 
simulation and off-source analyses) on the computing 
cluster. These jobs are monitored continuously to 
determine automatically when they have finished. Once they 
are completed, the post-processing (tuning and 
detection/upper limit) jobs are automatically launched and 
likewise monitored. Successful completion of these steps 
results in a web page presenting the results of the 
analysis, and an email notification being sent to 
human analysts. Additionally, the scripts which monitor 
the status of the search and post-processing jobs log that 
progress for each source event and regularly write this 
information to a summary status web page. These GCN 
parsing and triggering scripts are now operational, and 
X-Pipeline is currently autonomously analysing GRBs 
from the Swift [49] satellite. Open-box results are 
available in as little as 6 hours following a GCN alert. 
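
The launch-and-monitor logic described above amounts to a small state machine per trigger. The sketch below shows one way to express it; the state names, the callables passed in, and the dry run are all hypothetical stand-ins, and the real system uses Perl scripts and a cluster job scheduler that are not modelled here.

```python
from enum import Enum, auto

class Stage(Enum):
    WAITING_FOR_DATA = auto()
    EVENT_GENERATION = auto()   # includes simulation and off-source analyses
    POST_PROCESSING = auto()    # tuning and detection/upper-limit jobs
    DONE = auto()

def advance(stage, data_available, jobs_finished, launch, notify):
    """Move one trigger forward by at most one stage; return the new stage."""
    if stage is Stage.WAITING_FOR_DATA and data_available():
        launch("event_generation")
        return Stage.EVENT_GENERATION
    if stage is Stage.EVENT_GENERATION and jobs_finished("event_generation"):
        launch("post_processing")
        return Stage.POST_PROCESSING
    if stage is Stage.POST_PROCESSING and jobs_finished("post_processing"):
        notify("results page written; analysts emailed")
        return Stage.DONE
    return stage                # nothing ready yet: stay in the same stage

# Dry run with stand-ins that always report success.
log = []
stage = Stage.WAITING_FOR_DATA
while stage is not Stage.DONE:
    stage = advance(stage, lambda: True, lambda job: True, log.append, log.append)
print(log)
```

Calling `advance` once per polling interval keeps each check cheap and lets the same loop serve many triggers at different stages.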

Other modifications currently being made to X-Pipeline 
focus on the larger sky position error boxes 
from the Fermi satellite [50]. For S6-VSR2, most of the 
GRB triggers come from the GBM instrument on Fermi, 
which gives a typical position uncertainty of several 
degrees. This is much larger than the typical uncertainty 
of a few arcmin for GRBs from Swift in S5-VSR1. The 
X-Pipeline launch scripts are currently being modified 
to set up a grid of sky positions covering this error region, 
and the handling of events is being modified to minimize 
the additional computational time required. Finally, the 
suite of simulated waveforms has been expanded to 
include binary neutron star and black-hole-neutron-star 
binary inspirals, since these systems are widely thought 
to be the progenitors of short GRBs. 
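
One simple way to lay out a grid of sky positions over a circular error region is with concentric rings around the reported position. The sketch below is only an illustration of that idea: the ring layout, the function names, and the example numbers are assumptions, and the grid actually used by the X-Pipeline launch scripts may be constructed differently.

```python
import math

def sky_grid(ra_deg, dec_deg, radius_deg, step_deg):
    """Concentric rings of (ra, dec) points, in degrees, covering a circular
    error region of the given radius with roughly step_deg spacing."""
    ra0, dec0 = math.radians(ra_deg), math.radians(dec_deg)
    step = math.radians(step_deg)
    points = [(ra_deg % 360.0, dec_deg)]                 # centre of the region
    for k in range(1, int(radius_deg / step_deg + 1e-9) + 1):
        theta = k * step                                 # angular distance from centre
        n_points = max(1, math.ceil(2 * math.pi * math.sin(theta) / step))
        for j in range(n_points):
            phi = 2 * math.pi * j / n_points             # bearing around the centre
            sin_dec = (math.sin(dec0) * math.cos(theta)
                       + math.cos(dec0) * math.sin(theta) * math.cos(phi))
            dec = math.asin(max(-1.0, min(1.0, sin_dec)))
            ra = ra0 + math.atan2(
                math.sin(phi) * math.sin(theta) * math.cos(dec0),
                math.cos(theta) - math.sin(dec0) * math.sin(dec))
            points.append((math.degrees(ra) % 360.0, math.degrees(dec)))
    return points

def separation_deg(p, q):
    """Angular separation between two (ra, dec) points, in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (*p, *q))
    cos_sep = (math.sin(dec1) * math.sin(dec2)
               + math.cos(dec1) * math.cos(dec2) * math.cos(ra1 - ra2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_sep))))

# Example: a 5 degree error radius sampled at 1 degree spacing.
grid = sky_grid(ra_deg=120.0, dec_deg=30.0, radius_deg=5.0, step_deg=1.0)
assert all(separation_deg(grid[0], p) <= 5.0 + 1e-6 for p in grid)
```

The number of grid points grows roughly as the square of the ratio of error radius to spacing, which is why the text notes that event handling must also change to limit the extra computational cost.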

VI. SUMMARY 

X-Pipeline is a software package designed to perform 
autonomous searches for gravitational-wave bursts 
associated with astrophysical triggers such as gamma-ray 
bursts. It performs a fully coherent analysis of 
data from arbitrary networks of detectors to sensitively 
search small patches of the sky for gravitational-wave 
bursts. X-Pipeline features automated tuning of 
background rejection tests, and a built-in simulation engine 
with the ability to simulate effects such as calibration 
uncertainties and sky position errors. X-Pipeline can 
be launched automatically by receipt of a GCN email, 
performing a complete analysis of data, including tuning 
and identification of GWB candidates, without human 
intervention. Each astrophysical trigger is analysed 
as a separate search, with background estimation and 
tuning performed using independent data samples local 
to the trigger. In a test on actual detector data for a 
real GRB, we find that X-Pipeline is sensitive to 
signals approximately a factor of 1.7 weaker than those 
detectable by the cross-correlation technique used in 
previous LIGO searches. X-Pipeline has recently been used 
for the analysis of GRBs from the LIGO-Virgo S5-VSR1 
run, and is currently running autonomously during 
the S6-VSR2 run to search for gravitational waves 
associated with GRBs observed electromagnetically. Our goal 
is the rapid identification of possible GWBs on time scales 
short enough to prompt additional follow-up observations 
by other observatories. 